Journal of Mathematical Neuroscience (2014) 4:1
DOI 10.1186/2190-8567-4-1
A SpringerOpen Journal



RESEARCH Open Access 



Large Deviations for Nonlocal Stochastic Neural Fields 

Christian Kuehn • Martin G. Riedler 



Received: 22 February 2013 / Accepted: 10 June 2013 / Published online: 17 April 2014 
© 2014 C. Kuehn, M.G. Riedler; licensee Springer. This is an Open Access article distributed under the 
terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is 
properly cited. 

Abstract We study the effect of additive noise on integro-differential neural field 
equations. In particular, we analyze an Amari-type model driven by a Q-Wiener process,
and focus on noise-induced transitions and escape. We argue that proving a
sharp Kramers' law for neural fields poses substantial difficulties, but that one may 
transfer techniques from stochastic partial differential equations to establish a large 
deviation principle (LDP). Then we demonstrate that an efficient finite-dimensional 
approximation of the stochastic neural field equation can be achieved using a Galerkin 
method and that the resulting finite-dimensional rate function for the LDP can have 
a multiscale structure in certain cases. These results form the starting point for an 
efficient practical computation of the LDP. Our approach also provides the technical 
basis for further rigorous study of noise-induced transitions in neural fields based on 
Galerkin approximations. 

Keywords Stochastic neural field equations • Nonlocal equations • Large deviation
principle • Galerkin approximation

1 Introduction

Starting from the classical works of Wilson/Cowan [64] and Amari [1], there has been 
considerable interest in the analysis of spatiotemporal dynamics of mesoscale models
of neural activity. Continuum models for neural fields often take the form of nonlinear
integro-differential equations where the integral term can be viewed as a nonlocal 
interaction term; see [37] for a derivation of neural field models. Stationary states, 
traveling waves, and pattern formation for neural fields have been studied extensively; 
see, e.g., [20, 29] or the recent review by Bressloff [14] and references therein. 

In this paper, we study a stochastic neural field model. There are several
motivations for our approach. In general, it is well known that intra- and interneuron
[27] dynamics are subject to fluctuations. Many meso- or macroscale continuum
models have stochastic perturbations due to finite-size effects [38, 61]. Therefore,
there is certainly a genuine need to develop new techniques to analyze random neural 
systems [50]. For stochastic neural fields, there is also the direct motivation to un- 
derstand the relation between noise and short-term working memory [52] as well as 
noise-induced phenomena [54] in perceptual bistability [62]. Although an eventual 
goal is to match results from stochastic neural fields to actual cortex data [35], we 
shall not attempt such a comparison here. However, the techniques we develop could 
have the potential to make it easier to understand the relation between models and 
experiments; see Sect. 10 for a more detailed discussion. 

There is a relatively small amount of fairly recent work on stochastic neural fields, 
which we briefly review here. Brackley and Turner [11] study a neural field with 
a gain function, which has a random firing threshold. Fluctuating gain functions are 
also considered by Coombes et al. [22]. Bressloff and Webber [15] analyze a stochas- 
tic neural field equation with multiplicative noise while Bressloff and Wilkinson [16] 
study the influence of extrinsic noise on neural fields. In all these works, the focus 
is on the statistics of traveling waves such as front diffusion and the effects of noise 
on the wave speed. Hutt et al. [41] study the influence of external fluctuations on 
Turing bifurcation in neural fields. Kilpatrick and Ermentrout [43] are interested in 
stationary bump solutions. They observe numerically a noise-induced passage to ex- 
tinction as well as noise-induced switching of bump solutions and conjecture that 
"a Kramers' escape rate calculation" [43, p. 16] could be applied to stochastic neural 
fields, but they do not carry out this calculation. In particular, the question is whether
one can give a precise estimate of the mean transition time between metastable states
for stochastic neural field equations; for a precise statement of the classical Kramers'
law, see Sect. 5, Eq. (32). However, to the best of our knowledge, there seems to be
no general Kramers' law or large deviation principle (LDP) calculation available for 
continuum neural field models although large deviations have been of recent inter- 
est in neuroscience applications [13, 33]. It is one of the main goals of this paper to 
provide the basic steps toward a general theory. 

Although Kramers' law [5] and LDPs [26, 34] are well understood for 
finite-dimensional stochastic differential equations (SDEs), the work for infinite- 
dimensional evolution equations is much more recent. In particular, it has been shown 
very recently that one may extend Kramers' law to certain stochastic partial dif- 
ferential equations (SPDEs) [4, 6, 7] driven by space-time white noise. The work 
of Berglund and Gentz [7] provides a quite general strategy for "lifting" a
finite-dimensional Kramers' law to the SPDE setting using a Galerkin approximation due
to Blömker and Jentzen [8]. Since the transfer of PDE techniques to neural fields has
been very successful, either directly [51] or indirectly [14, 21], one may conjecture 
that the same strategy also works for SPDEs and stochastic neural fields.

In this paper, we consider a rate-based (or Amari) neural field model driven by a
Q-Wiener process W

$$ dU_t(x) = \Big[ -\alpha U_t(x) + \int_B w(x,y)\, f(U_t(y))\,dy \Big]\,dt + \epsilon\, dW_t(x) \tag{1} $$



for a trace-class operator $Q$, nonlinear gain function $f$, and an interaction kernel
$w$; the technical details and definitions are provided in Sect. 2. Observe that (1) is a
relatively general formulation of a nonlocal neural field. Hence, we expect that the
techniques developed in this paper carry over to much wider classes of neural fields
beyond (1) such as activity-based models.



Remark 1.1 To avoid confusion, we alert readers familiar with neural fields that the
nonlinear gain function $f$ in (1) is sometimes also called a "rate function." However,
we reserve "rate function" for a functional, to be denoted later by $I$, arising in the
context of an LDP, as this convention is standard in the context of LDPs.



Our main goal in the study of (1) is to provide estimates on the mean first passage 
times between metastable states. In particular, we develop the basic analytical tools 
to approximate equation (1) as well as its rate function using a finite-dimensional 
Galerkin approximation. By making the rate function as explicit as possible, we do 
not only provide a starting point for further analytical work, but also provide a frame- 
work for efficient numerical methods to analyze metastable states. 

The paper is structured as follows: The motivation for (1) is given in Sect. 3 where 
a formal calculation shows that a space-time white noise perturbation of the gain 
function in a deterministic neural field leads to (1). In Sect. 4, we briefly describe 
important features of the deterministic dynamics for (1) where $\epsilon = 0$. In particular,
we collect several examples from the literature where the classical Kramers' stability
configuration of bistable stationary states separated by an unstable state occurs
for Amari-type neural fields. In Sect. 5, we introduce the notation for Kramers' law
and LDPs and state the main theorem on finite-dimensional rate functions. In Sect. 6,
we argue that a direct approach to Kramers' law via "lifting" for (1) is likely to fail.
Although the Amari model has a hidden energy-type structure, we have not been
able to generalize the gradient-structure approach for SPDEs to the stochastic Amari
model. This raises doubt whether a Kramers' escape rate calculation can actually be
carried out, i.e., whether one may express the prefactor of the mean first-passage time in
the bistable case explicitly. Based on these considerations, we restrict ourselves to
deriving an LDP. In Sect. 7, the LDP is established by a direct transfer of a result
known for SPDEs. The disadvantage of this approach is that the resulting rate
function is difficult to calculate, analytically or numerically, in practice. Therefore, 
we establish in Sect. 8 the convergence of a suitable Galerkin approximation for (1). 
Using this approximation, one may apply results about the LDP for SDEs, which 
we carry out in Sect. 9. In this context, we also notice that the trace-class noise can 
induce a multiscale structure of the rate function in certain cases. The last two ob- 
servations lead to a tractable finite-dimensional approximation of the LDP and hence 
also an associated finite-dimensional approximation for first-exit time problems. We
conclude the paper in Sect. 10 with implications of our work and remarks about future
problems.

2 Amari-Type Models 

In this study, we consider stochastic neural field models with additive noise of the
form

$$ dU_t(x) = \Big[ -\alpha U_t(x) + \int_B w(x,y)\, f(U_t(y))\,dy \Big]\,dt + \epsilon\, dW_t(x) \tag{2} $$

for $x \in B \subset \mathbb{R}^d$, a small parameter $\epsilon > 0$, and $t \ge 0$, where $B$ is bounded and closed.
In (2) the solution $U$ models the averaged electrical potential generated by neurons at
location $x$ in an area of the brain $B$. Neural field equations of the form (2) are called
Amari-type equations or rate-based neural field models. The equation is driven
by an adapted space-time stochastic process $W_t(x)$ on a filtered probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$. The precise definition of the process $W$ will be given below.
The parameter $\alpha > 0$ is the decay rate for the potential, and $w : B \times B \to \mathbb{R}$ is a kernel
that models the connectivity of neurons at location $x$ to neurons at location $y$.
Positive values of $w$ model excitatory connections and negative values model
inhibitory connections. The gain function $f : \mathbb{R} \to \mathbb{R}_+$ relates the potential of
neurons to inputs into other neurons. Typically, the gain functions are chosen sigmoidal,
for example, (up to affine transformations of the argument) $f(u) = (1 + e^{-u})^{-1}$ or
$f(u) = (\tanh(u) + 1)/2$. These examples of gain functions are bounded and infinitely
often differentiable with bounded derivatives. However, throughout the paper, we only
make the standing assumption that

(H1) the gain function $f$ is globally Lipschitz continuous on $\mathbb{R}$.

We may transfer Eq. (2) into the Hilbert space setting of infinite-dimensional stochastic
evolution equations [23, 56] for the Hilbert space $L^2(B)$. Subsequently, brackets
$\langle \cdot, \cdot \rangle$ always denote the inner product on this Hilbert space. Moreover, we introduce
the following notation. Firstly, $F$ denotes the nonlinear Nemytskii operator defined
from $f$, i.e., $F(g)(x) = f(g(x))$ for any function $g \in L^2(B)$. The condition (H1)
implies that $F : L^2(B) \to L^2(B)$ is a Lipschitz continuous operator. Often, spatially
continuous solutions to (2) are also of interest, and thus we note that the Nemytskii
operator also preserves its Lipschitz continuity on the Banach space $C(B)$ with its
norm $\|g\|_0 = \sup_{x \in B} |g(x)|$ due to $B$ being bounded.^1 Secondly, the linear operator
$K$ is the integral operator defined by the kernel $w$

$$ Kg(x) = \int_B w(x,y)\, g(y)\,dy. \tag{3} $$

Throughout the paper, we assume that

(H2) the kernel $w$ is such that $K$ is a compact, self-adjoint operator on $L^2(B)$.

We note that an integral operator is self-adjoint if and only if the kernel is symmetric,
i.e., $w(x,y) = w(y,x)$ for all $x, y \in B$. A sufficient condition for the compactness of
$K$ is, e.g., $\|w\|_{L^2(B \times B)} < \infty$, in which case the operator is called a Hilbert-Schmidt
operator. Since $B$ is bounded, the continuity of the kernel $w$ on $B \times B$ implies the
compactness of $K$ considered as an integral operator on $C(B)$.

^1 We note that the boundedness assumption on the domain $B$ in this study is only necessary when dealing
with results in the space $C(B)$, which is the appropriate space for the LDP results. All other results in this
paper which only deal with the space $L^2(B)$, e.g., existence of solutions and convergence of the Galerkin
approximation, are also valid for unbounded spatial domains.

Then we rewrite Eq. (2) as a Hilbert space-valued stochastic evolution equation

$$ dU_t = \big[ -\alpha U_t + K F(U_t) \big]\,dt + \epsilon\, dW_t, \tag{4} $$

where $W$ is an $L^2(B)$-valued stochastic process. Interpreting the original equation in
this form, we now give a definition of the noise process assuming that

(H3) $W$ is a $Q$-Wiener process on $L^2(B)$, where the covariance operator $Q$ is a
nonnegative, symmetric trace class operator on $L^2(B)$.

For a detailed explanation of a Hilbert space-valued $Q$-Wiener process and its
covariance operator, we refer to, e.g., [23, 56]. As the operator $Q$ is nonnegative,
symmetric, and of trace class, there exists an orthonormal basis of $L^2(B)$ consisting of
eigenfunctions $v_i$ and corresponding nonnegative real eigenvalues $\lambda_i^2$, which satisfy
$\sum_{i=1}^\infty \lambda_i^2 < \infty$. It then holds that the $Q$-Wiener process $W$ satisfies

$$ W_t = \sum_{i=1}^\infty \lambda_i\, \beta_t^i\, v_i, \tag{5} $$

where $\beta^i$ are a sequence of independent scalar Wiener processes (cf. [56,
Proposition 2.1.10]). The series (5) converges in the mean-square on $C([0,T], L^2(B))$.
Furthermore, a straightforward adaptation of the proof of [56, Proposition 2.1.10]
shows that convergence in the mean-square also holds in the space $C([0,T], C(B))$
for every $T > 0$ if $v_i \in C(B)$ for all $i$ (corresponding to nonzero eigenvalues) and
$\sup_{x \in B} \big| \sum_{i=1}^\infty \lambda_i^2 v_i(x)^2 \big| < \infty$.
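The series representation (5) suggests a direct way to sample a trace-class noise path: truncate the sum and drive each mode with an independent scalar Wiener process. The following is a minimal sketch under stated assumptions, not the authors' implementation: the cosine functions on $[0, 2\pi]$ stand in for the eigenfunctions $v_i$, the coefficients `lambdas` are a hypothetical summable choice, and the names `cosine_basis` and `sample_q_wiener` are introduced here for illustration only.

```python
import math
import random


def cosine_basis(i, x):
    """Orthonormal cosine functions on [0, 2*pi]; an assumed choice for v_i."""
    if i == 0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    return math.cos(i * x / 2.0) / math.sqrt(math.pi)


def sample_q_wiener(lambdas, xs, t_final, n_steps, rng):
    """Sample W on the time grid k*dt, truncating series (5) to len(lambdas) modes."""
    dt = t_final / n_steps
    betas = [0.0] * len(lambdas)  # scalar Wiener processes beta^i, started at 0
    basis = [[cosine_basis(i, x) for x in xs] for i in range(len(lambdas))]
    path = [[0.0] * len(xs)]      # W_0 = 0
    for _ in range(n_steps):
        # Independent Gaussian increments with variance dt for every mode.
        betas = [b + math.sqrt(dt) * rng.gauss(0.0, 1.0) for b in betas]
        path.append([
            sum(lam * b * basis[i][j]
                for i, (lam, b) in enumerate(zip(lambdas, betas)))
            for j in range(len(xs))
        ])
    return path


rng = random.Random(0)
lambdas = [math.exp(-0.5 * i * i) for i in range(8)]  # hypothetical coefficients
xs = [2.0 * math.pi * j / 20 for j in range(21)]
W = sample_q_wiener(lambdas, xs, t_final=1.0, n_steps=100, rng=rng)
```

The truncation level controls how much of the trace $\sum_i \lambda_i^2$ is captured; for rapidly decaying coefficients a handful of modes already carries almost all of the variance.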

The existence and uniqueness of mild solutions to (4) with trace class noise for
a given initial condition $U_0 \in L^2(B)$ is guaranteed under the Lipschitz condition on $f$,
cf. [23], and we can write the solution in its mild form

$$ U_t = e^{-\alpha t} U_0 + \int_0^t e^{-\alpha(t-s)} K F(U_s)\,ds + \epsilon \int_0^t e^{-\alpha(t-s)}\,dW_s. \tag{6} $$

The solution possesses a modification in $C([0,T], L^2(B))$ and from now on we always
identify the solution (6) with its continuous modification. It is worthwhile to
note that for cylindrical Wiener processes, and thus in particular space-time white
noise, there does not exist a solution to (4). This contrasts with other well-studied
infinite-dimensional stochastic evolution equations, e.g., the stochastic heat equation.
Due to the representation of the solution (6), it follows that a solution can only be
as spatially regular as the stochastic convolution $\int_0^t e^{-\alpha(t-s)}\,dW_s$. In the present case,
the semigroup generated by the linear operator is not smoothing in contrast to, e.g.,
the semigroup generated by the Laplacian in the heat equation. Thus, the stochastic
convolution is only as smooth as the noise, which for space-time white noise is not
even a well-defined function. To be more specific, for cylindrical Wiener noise, the
series representation of the stochastic convolution (cf. Eq. (8) below) does not
converge in a suitable probabilistic sense.
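To make the evolution equation (4) concrete, one can discretize it on a spatial grid and step it forward with the Euler-Maruyama scheme. The sketch below is an illustration under assumptions that are not taken from the paper: a uniform grid with rectangle-rule quadrature, a truncated noise expansion with user-supplied coefficients `lambdas` and basis functions `basis`, and a logistic gain. The function name `euler_maruyama_field` is hypothetical.

```python
import math
import random


def sigmoid(u):
    """Logistic gain function f(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def euler_maruyama_field(u0, xs, kernel, alpha, eps, lambdas, basis,
                         dt, n_steps, rng, gain=sigmoid):
    """Euler-Maruyama steps for dU = [-alpha*U + K F(U)] dt + eps dW on a uniform grid."""
    n = len(xs)
    dx = xs[1] - xs[0]  # uniform grid spacing (assumption)
    # Rectangle-rule discretization of the integral operator K.
    k_mat = [[kernel(x, y) * dx for y in xs] for x in xs]
    v = [[basis(i, x) for x in xs] for i in range(len(lambdas))]
    u = list(u0)
    for _ in range(n_steps):
        fu = [gain(val) for val in u]
        drift = [-alpha * u[j] + sum(k_mat[j][k] * fu[k] for k in range(n))
                 for j in range(n)]
        # Increments of the scalar Wiener processes driving each noise mode.
        dbeta = [math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in lambdas]
        noise = [sum(lambdas[i] * dbeta[i] * v[i][j] for i in range(len(lambdas)))
                 for j in range(n)]
        u = [u[j] + drift[j] * dt + eps * noise[j] for j in range(n)]
    return u


# Usage with a hypothetical exponential kernel and three cosine noise modes.
grid = [j / 19 for j in range(20)]
u_end = euler_maruyama_field(
    u0=[0.0] * 20, xs=grid, kernel=lambda x, y: math.exp(-abs(x - y)),
    alpha=1.0, eps=0.1, lambdas=[1.0, 0.5, 0.25],
    basis=lambda i, x: math.cos(i * math.pi * x),
    dt=0.01, n_steps=50, rng=random.Random(1))
```

Because the semigroup $e^{-\alpha t}$ does not smooth, the spatial regularity of the simulated field is inherited from the basis functions used for the noise, which is why a truncated trace-class expansion is used here rather than grid-wise independent noise.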

We next aim to strengthen the spatial regularity of the solution (6), which will
be required later on. According to [23, Theorem 7.10] the solution (6) is a continuous
process taking values in the Banach space $C(B)$ if the initial condition satisfies
$U_0 \in C(B)$, the linear part in the drift of (4) generates a strongly continuous
semigroup on $C(B)$, the nonlinear term $KF$ is globally Lipschitz continuous on $C(B)$, and
finally, if the stochastic convolution is a continuous process taking values in $C(B)$.
It is easily seen that the first conditions are satisfied and sufficient conditions for the
latter property are given in the following lemma.



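Lemma 2.1 Assume that the orthonormal basis functions $v_i$ are Lipschitz continuous
with Lipschitz constants $L_i$ such that

$$ \sup_{x \in B} \Big| \sum_{i=1}^\infty \lambda_i^2 v_i(x)^2 \Big| < \infty, \qquad \sup_{x \in B} \Big| \sum_{i=1}^\infty \lambda_i^2 L_i^{2\rho} |v_i(x)|^{2(1-\rho)} \Big| < \infty \tag{7} $$

for a $\rho \in (0,1)$. Then the process

$$ O(x,t) := \int_0^t e^{-\alpha(t-s)}\,dW_s(x) = \sum_{i=1}^\infty \lambda_i \int_0^t e^{-\alpha(t-s)}\,d\beta_s^i\; v_i(x) \tag{8} $$

possesses a modification with $\gamma$-Hölder continuous paths in $\mathbb{R}_+ \times B$ for all
$\gamma \in (0, \rho/2)$.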



Proof We prove the lemma applying the Kolmogorov-Čentsov theorem (cf. [23,
Theorem 3.3 and Theorem 3.4]). Throughout the proof, $C$ is some finite constant, which
may change from line to line, but is independent of $x, y \in B$ and $t, s \ge 0$. We start by
showing that the process $O$ is Hölder continuous in the mean-square in each direction.
As the $v_i$ are assumed continuous, these are pointwise uniquely given and each $O(x,t)$
is, for fixed $x \in B$ and $t \ge 0$, a Gaussian random variable due to $\sum_{i=1}^\infty \lambda_i^2 v_i(x)^2 < \infty$.
Hence, for all $0 \le s \le t$ and all $x, y \in B$, we obtain

$$ \mathbb{E}|O(x,t) - O(y,t)|^2 = \sum_{i=1}^\infty \lambda_i^2 \int_0^t e^{-2\alpha(t-s)}\,ds\; |v_i(x) - v_i(y)|^2 \le C \sup_{z \in B} \Big| \sum_{i=1}^\infty \lambda_i^2 L_i^{2\rho} |v_i(z)|^{2(1-\rho)} \Big|\, |x-y|^{\rho} $$

using

$$ |v_i(x) - v_i(y)|^2 \le L_i^{2\rho}\, |x-y|^{\rho}\, |x-y|^{\rho}\, \big( |v_i(x)| + |v_i(y)| \big)^{2(1-\rho)} $$

for every $\rho \in [0,1]$ and $|x-y|^{\rho} \le \operatorname{diam}(B)^{\rho}$. Next, for the temporal regularity we
obtain

$$ \mathbb{E}|O(x,t) - O(x,s)|^2 = \sum_{i=1}^\infty \lambda_i^2 v_i(x)^2 \int_s^t e^{-2\alpha(t-r)}\,dr + \sum_{i=1}^\infty \lambda_i^2 v_i(x)^2 \int_0^s \big| e^{-\alpha(t-r)} - e^{-\alpha(s-r)} \big|^2\,dr $$
$$ = \sum_{i=1}^\infty \lambda_i^2 v_i(x)^2 \left( \frac{1 - e^{-2\alpha(t-s)}}{2\alpha} + \big(1 - e^{-\alpha(t-s)}\big)^2\, \frac{1 - e^{-2\alpha s}}{2\alpha} \right). $$

As the exponential function on the negative half-axis is Hölder continuous for every
$\rho \in [0,1]$, it holds

$$ \mathbb{E}|O(x,t) - O(x,s)|^2 \le C_\rho \sum_{i=1}^\infty \lambda_i^2 v_i(x)^2\, |t-s|^{\rho}. $$

Thus, overall Jensen's inequality yields $\mathbb{E}|O(x,t) - O(y,s)|^2 \le C_\rho \big( |x-y|^2 + |t-s|^2 \big)^{\rho/2}$.
Since the difference $O(x,t) - O(y,s)$ is centered Gaussian, it further
holds that

$$ \mathbb{E}|O(x,t) - O(y,s)|^{2m} \le C_{\rho,m} \big( |t-s|^2 + |x-y|^2 \big)^{m\rho/2} \qquad \forall m \in \mathbb{N}. $$

Now, the Kolmogorov-Čentsov theorem implies the statement of the lemma. □

We present an example to illustrate the type of noise we are generally interested 
in. Further motivation is provided in Sect. 3. 

Example 2.1 Consider the neural field equation on a $d$-dimensional cube $B = [0, 2\pi]^d$
with noise based on trigonometric basis functions of $L^2([0, 2\pi]^d)$. This type
of noise is almost ubiquitous in applications because, as for the stochastic heat equation, the
basis functions can be chosen such that the usual (Dirichlet, Neumann, or periodic)
boundary conditions are preserved. For the example of noise preserving homogeneous
Neumann boundary conditions, the basis functions are

$$ v_i(x) = \prod_{k=1}^d e_{i_k}(x_k), \tag{9} $$

where $x = (x_1, \dots, x_d)$, $i = (i_1, \dots, i_d)$ is a multi-index in $\mathbb{N}^d$ and the functions $e_{i_k}$
are given by

$$ e_{i_k}(x_k) = \begin{cases} \dfrac{1}{\sqrt{2\pi}}, & i_k = 0, \\[2mm] \dfrac{1}{\sqrt{\pi}} \cos(i_k x_k / 2), & i_k \ge 1. \end{cases} $$

The functions $v_i$ are for all multi-indices $i$ pointwise bounded by $\pi^{-d/2}$ and Lipschitz
continuous with Lipschitz constants given by $L_i = \pi^{-d/2} |i|$ (cf. [8, Lemma 5.3]). Next,
we construct a trace class Wiener process from these basis functions. A particularly
important example of spatiotemporal noise is smooth noise with exponentially decaying
spatial correlation length [15, 36, 43], i.e.,

$$ \mathbb{E}\, W_t(x) W_s(y) = \min\{t, s\}\, \frac{1}{(2\xi)^d} \exp\Big( -\frac{\pi}{4} \frac{|x-y|^2}{\xi^2} \Big) + \text{correction on the boundary} \tag{10} $$

for a parameter $\xi > 0$ modeling the spatial correlation length. Note that for $\xi \to 0$ this
noise process approximates space-time white noise. Following [60], we can calculate
under the assumption that $\xi \ll 2\pi$ the coefficients $\lambda_i^2$ such that the $Q$-Wiener process
(5) possesses the correlation function (10) and obtain

$$ \lambda_i^2 = \exp\Big( -\frac{\xi^2 |i|^2}{4\pi} \Big). \tag{11} $$
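As a quick numerical sanity check of the trace-class property for the eigenvalues (11), one can evaluate partial sums of $\sum_i \lambda_i^2$ and observe that they saturate. The sketch below assumes that $|i|$ denotes the Euclidean norm of the multi-index (the text does not fix the norm), in which case the Gaussian weight factorizes over the $d$ coordinates. The function names are introduced here for illustration only.

```python
import math


def eigenvalue_sq(i_norm_sq, xi):
    """lambda_i^2 from (11) as a function of |i|^2 and the correlation length xi."""
    return math.exp(-xi * xi * i_norm_sq / (4.0 * math.pi))


def partial_trace(xi, n_max, d=1):
    """Sum of lambda_i^2 over multi-indices i in {0, ..., n_max}^d (Euclidean |i| assumed)."""
    one_dim = sum(eigenvalue_sq(n * n, xi) for n in range(n_max + 1))
    # With |i|^2 = i_1^2 + ... + i_d^2 the d-dimensional sum is a d-fold product.
    return one_dim ** d


for n_max in (5, 10, 20, 40):
    print(n_max, partial_trace(1.0, n_max), partial_trace(1.0, n_max, d=2))
```

Smaller correlation lengths $\xi$ slow the decay of the eigenvalues, so more modes are needed before the partial sums settle; this is consistent with the remark that $\xi \to 0$ approaches space-time white noise, for which the trace diverges.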



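Now, it is easy to see that for this choice of eigenvalues the noise is of trace class and
moreover the additional conditions of Lemma 2.1 are satisfied: As the functions $v_i$
are bounded, we obtain

$$ \sup_{x \in B} \Big| \sum_i \lambda_i^2 v_i(x)^2 \Big| \le \pi^{-d} + \pi^{-d} \sum_{N=1}^\infty \sum_{i \in \{0,\dots,N\}^d \setminus \{0,\dots,N-1\}^d} \exp\Big( -\frac{\xi^2 |i|^2}{4\pi} \Big) \le \pi^{-d} + \pi^{-d} \sum_{N=1}^\infty \big( (N+1)^d - N^d \big) \exp\Big( -\frac{\xi^2 N^2}{4\pi} \Big) < \infty $$

and the second condition of (7) is satisfied as

$$ \sup_{x \in B} \Big| \sum_i \lambda_i^2 L_i^{2\rho} |v_i(x)|^{2(1-\rho)} \Big| \le \pi^{-d} \sum_{N=1}^\infty \sum_{i \in \{0,\dots,N\}^d \setminus \{0,\dots,N-1\}^d} \exp\Big( -\frac{\xi^2 |i|^2}{4\pi} \Big) |i|^{2\rho} < \infty. $$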



3 Gain Function Perturbation 

Another motivation for the considered additive noise neural field equations stems
from a (formal) perturbation of the gain function $f$ with space-time white noise.
Let $W$ denote space-time white noise and consider the randomly perturbed Amari
equation

$$ \partial_t U(x,t) = -\alpha U(x,t) + \int_B w(x,y) \big( f(U(y,t)) + \epsilon W(y,t) \big)\,dy. \tag{12} $$

Recall that, by assumption (H2), the integral operator $K$ defined by the kernel $w$ is
a self-adjoint compact operator. Thus, the spectral theorem implies that $K$ possesses
only real eigenvalues $\lambda_i$, $i \in \mathbb{N}$, and the corresponding eigenfunctions $v_i$ form an
orthonormal basis of $L^2(B)$. If additionally we assume that

(H4) $K$ is a Hilbert-Schmidt operator on $L^2(B)$, that is, $\|w\|_{L^2(B \times B)} < \infty$,

then the eigenvalues satisfy $\sum_{i=1}^\infty \lambda_i^2 < \infty$. Hence, $K$ possesses the series
representation

$$ K g = \sum_{i=1}^\infty \lambda_i \langle g, v_i \rangle\, v_i, $$

which yields for the perturbed equation (12) the representation

$$ \partial_t U(x,t) = -\alpha U(x,t) + \sum_{i=1}^\infty \lambda_i \Big( \int_B f(U(y,t))\, v_i(y)\,dy + \epsilon \langle W(\cdot,t), v_i \rangle \Big) v_i(x). $$

Next, note that the random variables $\dot\beta_t^i = \langle W(\cdot,t), v_i \rangle$ form a sequence of
independent scalar white noise processes in time. Therefore, the perturbed equation becomes

$$ \partial_t U(x,t) = -\alpha U(x,t) + \int_B w(x,y)\, f(U(y,t))\,dy + \epsilon \sum_{i=1}^\infty \lambda_i\, \dot\beta_t^i\, v_i(x). $$

Rewriting this equation in the usual notation of stochastic differential equations we
obtain

$$ dU_t(x) = \Big[ -\alpha U_t(x) + \int_B w(x,y)\, f(U_t(y))\,dy \Big]\,dt + \epsilon\, dW_t(x), \tag{13} $$

where

$$ W_t(x) = \sum_{i=1}^\infty \lambda_i\, \beta_t^i\, v_i(x) $$

is a trace-class Wiener process on the Hilbert space $L^2(B)$. Note that, in contrast to
(5), here the coefficients $\lambda_i$ may be negative; however, as $-\beta^i$ is also a Wiener process,
this slight inconsistency can be neglected.

We next want to discuss spatial continuity of the solution to this equation with its
particular noise structure. It is clear that this should translate into smoothness conditions
on the kernel $w$. Due to Lemma 2.1, it is sufficient to establish conditions (7):
First, it holds that

$$ \sum_{i=1}^\infty \lambda_i^2 v_i(x)^2 = \sum_{i=1}^\infty \Big( \int_B w(x,y)\, v_i(y)\,dy \Big)^2 = \sum_{i=1}^\infty \langle w(x,\cdot), v_i \rangle^2 = \|w(x,\cdot)\|_{L^2(B)}^2 $$

due to Parseval's identity. Hence, the first condition of (7) becomes

$$ \sup_{x \in B} \|w(x,\cdot)\|_{L^2(B)} < \infty. \tag{14} $$

Next, the basis functions are continuous if the kernel $w(x,y)$ is continuous in $x$, and
as the minimal Lipschitz constant is given by the supremum of the derivatives we
obtain

$$ L_i = \sup_{x \in B} \Big| \frac{1}{\lambda_i} \nabla_x \int_B w(x,y)\, v_i(y)\,dy \Big| \le \frac{1}{|\lambda_i|} \sup_{x \in B} \|\nabla_x w(x,\cdot)\|_{L^2(B)} $$

due to the Cauchy-Schwarz inequality. Therefore, the second condition in (7) is
satisfied if

$$ \sup_{x \in B} \|\nabla_x w(x,\cdot)\|_{L^2(B)} < \infty \quad \text{and} \quad \sum_{i=1}^\infty |\lambda_i|^{2(1-\rho)}\, |v_i(x)|^{2(1-\rho)} \le M \quad \forall x \in B \tag{15} $$

for a $\rho \in (0,1)$ and an $M < \infty$. The condition (14) and the first part of (15) are easily
checked, but for the second part of (15) usually theoretical results on the speed of
decay of the eigenvalues have to be obtained. We note that (15) is certainly satisfied
with $\rho = 1/2$ if $K$ is a trace class operator and the eigenfunctions are pointwise
bounded independently of $i$; see, e.g., Example 2.1.



4 Deterministic Dynamics 

The classical deterministic Amari model, obtained for $\epsilon = 0$ in (2), is

$$ \partial_t U(x,t) = -\alpha U(x,t) + \int_B w(x,y)\, f(U(y,t))\,dy, \tag{16} $$

where $B \subset \mathbb{R}^d$. Note that we may allow $B$ to be unbounded for the deterministic
case as solutions of (16) do exist in this case [55]. Suppose there exists a stationary
solution $U^* = U^*(x)$ of (16). To determine the stability of $U^*$ consider $U(x,t) =
U^*(x) + \psi(x,t)$. Substituting into (16) and Taylor-expanding around $U^*$ yields the
linearized problem

$$ \partial_t \psi(x,t) = -\alpha \psi(x,t) + \int_B w(x,y)\, (Df)(U^*(y))\, \psi(y,t)\,dy. \tag{17} $$

Hence, the standard ansatz $\psi(x,t) = \psi_0(x) e^{\mu t}$ leads to the eigenvalue problem

$$ (\mu + \alpha)\, \psi_0(x) = \int_B w(x,y)\, (Df)(U^*(y))\, \psi_0(y)\,dy =: (\mathcal{L}\psi_0)(x) \quad \text{or} \quad \mathcal{L}\psi_0 = \eta\, \psi_0. \tag{18} $$

The linear stability condition $\mu < 0$ is equivalent to $\eta < \alpha$ where $\eta \in \operatorname{spec}(\mathcal{L})$. The
stability analysis can be reduced to the understanding of the operator $\mathcal{L}$. However,
this is a highly nontrivial problem as the behavior depends upon $B$, $U^*(x)$, $w(x,y)$,
and $f(u)$.
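For a given stationary state, the eigenvalue problem (18) can be explored numerically by discretizing the operator on a grid. The sketch below uses plain power iteration, which is only appropriate under the assumption that the dominant eigenvalue is real, simple, and positive (for example, a nonnegative kernel with an increasing gain function); it is an illustration, not a method from the paper. The function name `leading_eta`, the kernel, and the state in the usage example are hypothetical.

```python
import math


def leading_eta(kernel, dgain_at_state, xs, dx, n_iter=100):
    """Power iteration for the dominant eigenvalue eta of the discretized operator in (18).

    dgain_at_state[k] holds (Df)(U*(x_k)); dx is the quadrature weight.
    """
    n = len(xs)
    a = [[kernel(xs[j], xs[k]) * dgain_at_state[k] * dx for k in range(n)]
         for j in range(n)]
    psi = [1.0] * n
    eta = 0.0
    for _ in range(n_iter):
        new = [sum(a[j][k] * psi[k] for k in range(n)) for j in range(n)]
        norm = math.sqrt(sum(val * val for val in new))
        if norm == 0.0:
            return 0.0
        # Rayleigh-type quotient <A psi, psi> / <psi, psi> as eigenvalue estimate.
        eta = sum(new[j] * psi[j] for j in range(n)) / sum(p * p for p in psi)
        psi = [val / norm for val in new]
    return eta


# Usage: hypothetical exponential kernel on [0, 1], constant state U* = 0 with the
# logistic gain, whose derivative at 0 equals 1/4.
n_grid = 50
step = 1.0 / n_grid
mid = [(k + 0.5) * step for k in range(n_grid)]
eta = leading_eta(lambda x, y: math.exp(-abs(x - y)), [0.25] * n_grid, mid, step)
alpha = 1.0
print("eta =", eta, "linearly stable:", eta < alpha)
```

Comparing the returned value with the decay rate then gives the linear stability test $\eta < \alpha$ stated above, at least for the leading mode.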

An LDP and Kramers' law are of particular interest in the case of bistability.
Therefore, we point out that there are many situations where (16) does have three
stationary solutions: $U_\pm(x)$, which are stable, and $U_0(x)$, which is unstable. The
following three examples make this claim more precise.

Example 4.1 The first example is presented by Ermentrout and McLeod [29]. Let
$B = \mathbb{R}$, $w(x,y) = w(|x-y|)$, $\alpha = 1$ and assume that $0 \le U(x,t) \le 1$. Furthermore,
suppose that $f \in C^1([0,1], \mathbb{R})$ with $f' > 0$ and

$$ \tilde{f}(U) := -U + f(U) \tag{19} $$

has precisely three zeros $U = 0, a, 1$ with $0 < a < 1$. The additional conditions
$f'(0) < 1$ and $f'(1) < 1$ guarantee stability of the stationary solutions $U = 0$ and
$U = 1$. As an even more explicit assumption [29, p. 463], one may consider a Dirac
$\delta$-distribution for $w$ in (16), which yields

$$ \partial_t U(x,t) = -U(x,t) + F(U(x,t)). \tag{20} $$

Suppose there are precisely three solutions for $U = F(U)$ given by $U = 0, a, 1$ with
$0 < a < 1$. If $F'(0) < 1$, $F'(1) < 1$ and $F'(a) > 1$ then (20) has an unstable stationary
solution between the two stable stationary solutions.

Example 4.2 An even more concrete example is given by Guo and Chow [39, 40].
They assume $B = \mathbb{R}$, $w(x,y) = w(x-y)$, $\alpha = 1$ and fix two functions

$$ f(u) = [b(u - u_b) + 1]\, H(u - u_b), \qquad w(x) = A e^{-a|x|} - e^{-|x|}, $$

where $H(\cdot)$ is the Heaviside function and $b$, $a$, $A$, and $u_b$ are parameters. Depending
on parameter values, one may obtain three constant stationary solutions exhibiting
bistability as expected from Example 4.1. However, there are also parameter values
so that three stationary pulses exhibiting bistability exist.

Note that the choice $B = \mathbb{R}$ is not essential to obtain two deterministically stable
stationary states $U_\pm(x)$ and one deterministically unstable stationary state $U_0(x)$.
The important aspect is that certain algebraic equations, such as $U = f(U)$ and $U =
F(U)$ in Example 4.1, have the correct number of solutions. Furthermore, one has to
make sure that the sign of the nonlinearity $f$ is chosen correctly to obtain the desired
deterministic stability results for the stationary solutions. Hence, we expect that a
similar situation also holds for bounded domains; see also [63].

Examples 4.1-4.2 are typical for many similar cases with $x \in \mathbb{R}$ or $x \in \mathbb{R}^2$. Many
results on existence and stability of stationary solutions are available; see, e.g., [1, 46, 
51, 52], and references therein. 






Example 4.3 As a higher-dimensional example, one may consider the work by Jin,
Liang, and Peng [42] who assume that $w(x,y) = w(x-y)$, $\alpha = 1$, $B = \mathbb{R}^d$, and

$$ Z_\infty := \int_{\mathbb{R}^d} w(x)\,dx < \infty, \qquad \kappa Z_\infty > 1, $$

where $\kappa$ is the Lipschitz constant of $f \in C^1(\mathbb{R}, \mathbb{R})$. Furthermore, suppose $f'$ is
uniformly continuous and

$$ f'(U) Z_\infty < 1 \ \text{for } U \in (-\infty, U_1) \cup (U_2, \infty), \qquad f'(U) Z_\infty = 1 \ \text{for } U \in \{U_1, U_2\}, \qquad f'(U) Z_\infty > 1 \ \text{for } U \in (U_1, U_2), $$

for $U_1 < 0 < U_2$. Then [42, Proposition 11] the conditions

$$ -U_1 + f(U_1) Z_\infty < 0 \quad \text{and} \quad -U_2 + f(U_2) Z_\infty > 0 $$

yield three stationary solutions $U_-^*$, $U_0^*$, and $U_+^*$. The solutions $U_\pm^*$ are stable and
satisfy $U_-^* < 0$ and $U_+^* > 0$. The solution $U_0^*$ is unstable.

Although we only focus on stationary solutions, it is important to remark that the
techniques developed here could, in principle, also be applied to traveling waves
$U(x,t) = U(x - st)$ for $s > 0$. The existence and stability of traveling waves for
(16) has been investigated for many different situations; see, e.g., [12, 14, 21, 29],
and references therein. However, it seems reasonable to restrict ourselves here to the 
stationary case as even for this simpler case an LDP and Kramers' law are not yet 
well understood. 



5 Large Deviations and Kramers' Law 

Here, we briefly introduce the background and notation for LDPs and Kramers' law
needed throughout the remaining part of the paper; see [26, 34] for more details.
Consider a topological space $X$ with Borel $\sigma$-algebra $\mathcal{B}_X$. A mapping $I : X \to [0, \infty]$
is called a good rate function if it is lower semicontinuous and the level set $\{h :
I(h) \le a\}$ is compact for each $a \in [0, \infty)$. Sometimes the term action functional
is used instead of rate function. Consider a family $\{\mu^\epsilon\}$ of probability measures on
$(X, \mathcal{B}_X)$. The measures $\{\mu^\epsilon\}$ satisfy an LDP with good rate function $I$ if

$$ -\inf_{\Gamma^\circ} I \le \liminf_{\epsilon \to 0} \epsilon^2 \ln \mu^\epsilon(\Gamma) \le \limsup_{\epsilon \to 0} \epsilon^2 \ln \mu^\epsilon(\Gamma) \le -\inf_{\bar{\Gamma}} I \tag{21} $$

holds for any measurable set $\Gamma \subset X$; often infima over the interior $\Gamma^\circ$ and closure
$\bar{\Gamma}$ coincide so that lim inf and lim sup coincide at a common limit. One of the most
classical cases is the application of (21) to finite-dimensional SDEs

$$ du_t = g(u_t)\,dt + \epsilon\, G(u_t)\,d\beta_t \tag{22} $$

where $u_t \in \mathbb{R}^N$, $g : \mathbb{R}^N \to \mathbb{R}^N$, $G : \mathbb{R}^N \to \mathbb{R}^{N \times k}$, $\beta_t = (\beta_t^1, \dots, \beta_t^k)^T$ is a vector
of $k$ independent Brownian motions and we shall assume that the initial condition
$u_0 \in \mathbb{R}^N$ is deterministic. If we want to emphasize that $u_t$ depends on $\epsilon$, we shall also
use the notation $u_t^\epsilon$. The topological space is chosen as a path space

$$ X := C_0([0,T], \mathbb{R}^N) = \{ \phi \in C([0,T], \mathbb{R}^N) : \phi(0) = u_0 \}. $$

To state the next result, we also need the Sobolev space

$$ H_1^N := \{ \phi : [0,T] \to \mathbb{R}^N : \phi \text{ absolutely continuous},\ \phi' \in L^2,\ \phi(0) = 0 \}. \tag{23} $$

Furthermore, we are going to assume that the diffusion matrix $D(u) := G(u) G(u)^T \in
\mathbb{R}^{N \times N}$ is positive definite.

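Theorem 5.1 ([26, 34]) The SDE (22) satisfies the LDP (21) given by

$$ -\inf_{\Gamma^\circ} I \le \liminf_{\epsilon \to 0} \epsilon^2 \ln \mathbb{P}\big( (u_t^\epsilon)_{t \in [0,T]} \in \Gamma \big) \le \limsup_{\epsilon \to 0} \epsilon^2 \ln \mathbb{P}\big( (u_t^\epsilon)_{t \in [0,T]} \in \Gamma \big) \le -\inf_{\bar{\Gamma}} I \tag{24} $$

for any measurable set of paths $\Gamma \subset X$ with good rate function

$$ I(\phi) = I_{[0,T]}(\phi) = \begin{cases} \dfrac{1}{2} \displaystyle\int_0^T \big( \phi_t' - g(\phi_t) \big)^T D(\phi_t)^{-1} \big( \phi_t' - g(\phi_t) \big)\,dt, & \text{if } \phi \in u_0 + H_1^N, \\[2mm] +\infty, & \text{otherwise.} \end{cases} \tag{25} $$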
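To illustrate how the rate function (25) penalizes deviations from the deterministic flow, the following sketch evaluates a discretized action for a scalar path ($N = 1$). The forward-difference quotient and the left-endpoint quadrature are assumptions of this illustration, and the function name `discrete_action` is hypothetical.

```python
import math


def discrete_action(path, dt, drift, diffusion=lambda u: 1.0):
    """Left-endpoint discretization of the rate function (25) for a scalar path.

    path[k] approximates phi(k*dt); diffusion(u) plays the role of D(u) > 0.
    """
    total = 0.0
    for k in range(len(path) - 1):
        velocity = (path[k + 1] - path[k]) / dt   # forward difference for phi'
        defect = velocity - drift(path[k])        # deviation from the deterministic flow
        total += 0.5 * defect * defect / diffusion(path[k]) * dt
    return total


# Usage with the linear drift g(u) = -u on [0, 1].
dt = 0.001
n = 1000
along_flow = [math.exp(-k * dt) for k in range(n + 1)]   # solves phi' = -phi
held_fixed = [1.0] * (n + 1)                             # stays at u = 1 against the drift
print(discrete_action(along_flow, dt, lambda u: -u))
print(discrete_action(held_fixed, dt, lambda u: -u))
```

A path that follows the deterministic dynamics has (up to discretization error) zero action, whereas holding the state fixed against the drift accumulates cost at rate $g(u)^2 / (2D(u))$; minimizing such a discretized functional over paths with fixed endpoints is one way to approximate the quantities entering first-exit estimates.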



An important application of the LDP (24) is the so-called first-exit problem. Suppose
that $u_t$ starts near a stable equilibrium $u^* \in \mathcal{D} \subset \mathbb{R}^N$ of the deterministic system
given by setting $\epsilon = 0$ in (22), where $\mathcal{D}$ is a bounded domain with smooth boundary.
Define the first-exit time

$$ \tau_{\mathcal{D}}^\epsilon := \inf\{ t > 0 : u_t^\epsilon \notin \mathcal{D} \}. \tag{26} $$

To formalize the application of the LDP, define the mapping

$$ Z(u, v; s) := \inf\{ I(\phi) : \phi \in C([0,s], \mathbb{R}^N),\ \phi_0 = u,\ \phi_s = v \} \tag{27} $$

which is the cost for a path starting at $u$ to reach $v$ in time $s$. Next, assume that $\mathcal{D}$ is
properly contained inside the (deterministic) basin of attraction of $u^*$. Then one can
show [34, Theorem 4.1, p. 124] that

$$ \lim_{\epsilon \to 0} \epsilon^2 \ln \mathbb{P}\big( \tau_{\mathcal{D}}^\epsilon \le t \mid u_0 = u \big) = -\inf\{ Z(u, v; s) : s \in [0,t],\ v \notin \mathcal{D} \}. \tag{28} $$

To get more precise information on the exit distribution, one defines the function

$$ Z(u^*, v) := \inf_{t > 0} Z(u^*, v; t) $$

which is called the quasipotential for $u^*$. It is natural to minimize the quasipotential
over $\partial\mathcal{D}$ and define

$$ \bar{Z} := \inf_{v \in \partial\mathcal{D}} Z(u^*, v). $$

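Theorem 5.2 ([34, Theorem 4.2, p. 127], [26, Theorem 5.7.11]) For all initial
conditions $u \in \mathcal{D}$ and all $\delta > 0$, the following two limits hold:

$$ \lim_{\epsilon \to 0} \mathbb{P}\big( e^{(\bar{Z} - \delta)/\epsilon^2} < \tau_{\mathcal{D}}^\epsilon < e^{(\bar{Z} + \delta)/\epsilon^2} \mid u_0 = u \big) = 1, \tag{29} $$

$$ \lim_{\epsilon \to 0} \epsilon^2 \ln \mathbb{E}\big[ \tau_{\mathcal{D}}^\epsilon \mid u_0 = u \big] = \bar{Z}. \tag{30} $$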

If the SDE (22) has a gradient structure with identity diffusion matrix, i.e.,

\[ g(u) = -\nabla V(u) \ \text{ for } V : \mathbb{R}^N \to \mathbb{R}, \quad\text{and}\quad G(u) = \mathrm{Id} \in \mathbb{R}^{N\times N}, \tag{31} \]

then one can show [34, Sect. 4.3] that the quasipotential is given by $Z(u^*, v) = 2(V(v) - V(u^*))$. If the potential has precisely two local minima $u^*_\pm$ and a saddle point $u^*_s$ with $N-1$ stable directions so that the Hessian $\nabla^2 V(u^*_s)$ has eigenvalues

\[ \rho_1(u^*_s) < 0 < \rho_2(u^*_s) < \cdots < \rho_N(u^*_s), \]

then one can even refine Theorem 5.2. Suppose $u_0 = u^*_-$; then the mean first passage time to $u^*_+$ satisfies

\[ E\big[\inf\{t > 0 : \|u_t - u^*_+\|_2 < \delta\}\big] \sim \frac{2\pi}{|\rho_1(u^*_s)|} \sqrt{\frac{|\det(\nabla^2 V(u^*_s))|}{\det(\nabla^2 V(u^*_-))}}\; e^{2(V(u^*_s) - V(u^*_-))/\epsilon^2}, \tag{32} \]

where $\|\cdot\|_2$ denotes the usual Euclidean norm in $\mathbb{R}^N$. The formula (32) is also known
as Kramers' law [5] or Arrhenius-Eyring-Kramers' law [2, 31, 45]. Note that the key
differences with the general LDP (29) for the first-exit problem are that (32) yields a 
precise prefactor for the exponential transition time and uses the explicit form of the 
good rate function for gradient systems. It is interesting to note that a rigorous proof 
of (32) has only been obtained quite recently [9, 10]. 
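
As an illustration of how the right-hand side of (32) is evaluated, the following minimal sketch computes the Kramers estimate for a hypothetical two-dimensional double-well potential $V(u) = (u_1^2 - 1)^2/4 + u_2^2/2$. The potential, the function name `kramers_mean_exit_time` and the value of $\epsilon$ are assumptions made for this example only; they do not come from the neural field model.

```python
import numpy as np

def kramers_mean_exit_time(hess_saddle, hess_min, barrier, eps):
    """Right-hand side of (32) for a gradient SDE with identity diffusion."""
    rho = np.linalg.eigvalsh(hess_saddle)          # eigenvalues, ascending order
    if not (rho[0] < 0 < rho[1]):
        raise ValueError("saddle must have exactly one unstable direction")
    prefactor = (2 * np.pi / abs(rho[0])) * np.sqrt(
        abs(np.linalg.det(hess_saddle)) / np.linalg.det(hess_min))
    return prefactor * np.exp(2 * barrier / eps**2)

# Hypothetical double well V(u) = (u1^2 - 1)^2 / 4 + u2^2 / 2:
# minima at (-1, 0) and (1, 0), saddle at (0, 0), barrier V(saddle) - V(min) = 1/4.
hess_saddle = np.diag([-1.0, 1.0])   # Hessian of V at the saddle (0, 0)
hess_min = np.diag([2.0, 1.0])       # Hessian of V at the minimum (-1, 0)
estimate = kramers_mean_exit_time(hess_saddle, hess_min, barrier=0.25, eps=0.5)
print(estimate)
```

The exponential factor depends only on the barrier height, whereas the prefactor requires the full Hessians at the saddle and at the starting minimum; this is exactly the information that the LDP (29)-(30) alone does not provide.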



6 Gradient Structures in Infinite Dimensions 

The finite-dimensional Kramers' formula (32) applies to SDEs (22) with a gradient structure $g(u) = -\nabla V(u)$, where $V : \mathbb{R}^N \to \mathbb{R}$ is the potential. A generalization of Kramers' law has been carried over to the infinite-dimensional case of SPDEs given by

\[ dU = [\Delta U - h'(U)]\,dt + \epsilon\,dW(x,t) \tag{33} \]

for $U = U(x,t)$, $x \in B \subset \mathbb{R}$, where $B$ is a bounded interval, $h \in C^k(\mathbb{R}, \mathbb{R})$ for suitably large $k \in \mathbb{N}$, $W(x,t)$ denotes space-time white noise, and either Dirichlet or Neumann boundary conditions are used [4, 6, 7]. A crucial reason why this generalization works
is that the SPDE (33) has a gradient-type structure [32] given by the energy functional 



\[ V[U] := \int_B \Big[\frac{1}{2} U'(x)^2 + h(U(x))\Big]\,dx. \tag{34} \]



More precisely, when $\epsilon = 0$ one obtains from (33) a PDE, say with Dirichlet boundary conditions,

\[ dU = [\Delta U - h'(U)]\,dt, \quad U(x) = 0 \ \text{on } \partial B, \tag{35} \]

for a given sufficiently smooth initial condition $U(x, 0) = U_0(x) \in C^k(\mathbb{R}, \mathbb{R})$. Standard parabolic regularity [30, Sect. 7.1] implies that solutions $U$ of (35) lie in the Sobolev spaces $H_0^k(B)$. Computing the Gâteaux derivative in this space yields

\[ \nabla_z V[U] = \int_B \big[-U''(x) + h'(U(x))\big]\,z(x)\,dx. \tag{36} \]

The Gâteaux derivative is equal to the Fréchet derivative $\nabla V = DV$ by a standard
continuity result [25, p. 47]. Hence, (36) shows that the stationary solutions of (35)
are critical points of the gradient functional $V$. Since the gradient structure of the
deterministic PDE (35) is a key structure to obtain a Kramers'-type estimate for the
SPDE (33), we would like to check whether there is an analogue available for the 
deterministic Amari model (16). 

We shall assume for simplicity that $f \in BC^1(\mathbb{R})$ for the calculations in this section. Although this is a slightly stronger assumption than (H1), we shall see below that even with this assumption we are not able to obtain an immediate generalization of (36). Using a direct modification of the results in [55], it follows that the deterministic Amari model (16) has solutions $U(x,t)$ in the Hölder space $BC^{a}(B) \times BC^{a}([0,T])$ for $a \in (0,1]$, where $B \subset \mathbb{R}^d$ is the usual domain we use for the Amari model. Now consider the analogous naive guess to (36) given by



\[ V[U] := \int_B \Big[\frac{\alpha}{2} U(x)^2 - \int_B \int_0^{U(y)} f(r)\,w(x,y)\,dr\,dy\Big]\,dx. \tag{37} \]



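Computing the derivative in $BC^{a}(B)$ yields

\[ \begin{aligned} \nabla_z V[U] &= \lim_{\delta\to 0} \frac{1}{\delta}\big(V[U + \delta z] - V[U]\big) \\ &= \int_B \Big[\alpha U(x) z(x) - \int_B f(U(y))\,w(x,y)\,z(y)\,dy\Big]\,dx. \end{aligned} \tag{38} \]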



Therefore, setting $\nabla_z V[U] = 0$ is not equivalent to the solution of the stationary problem

\[ -\alpha U(x) + \int_B w(x,y) f(U(y))\,dy = 0. \]

Due to the presence of the different terms $z(x)$ and $z(y)$ in (38), one may guess that the modified functional

\[ V[U] := \int_B \Big[\frac{\alpha}{2} U(x)^2 - \frac{1}{2}\int_B f(U(y)) f(U(x))\,w(x,y)\,dy\Big]\,dx \tag{39} \]

could work. However, another direct computation shows that

\[ \begin{aligned} \nabla_z V[U] &= \int_B \alpha U(x) z(x)\,dx - \frac{1}{2}\int_B\int_B f(U(x))\,Df(U(y))\,w(x,y)\,z(y)\,dy\,dx \\ &\quad - \frac{1}{2}\int_B\int_B f(U(y))\,Df(U(x))\,w(x,y)\,z(x)\,dy\,dx \\ &= \alpha\langle U, z\rangle - \langle KF(U), Df(U) z\rangle. \end{aligned} \]



Hence, $f$ and its derivative $Df$ both appear instead of the desired formulation; by a similar computation one can show that replacing $f(U(\cdot))$ in (39) by $\int_0^{U(\cdot)} f(r)\,dr$ fails as well. Hence, there does not seem to be a natural generalization for the guess for the gradient functional (34). However, one has to consider possible coordinate changes. The idea to apply a preliminary transformation has been discussed, e.g., in [28, p. 2] and [51, p. 488]. Assume that

\[ f^{-1} =: g \ \text{ exists and } g' \neq 0. \tag{40} \]



Define $P(x,t) := f(U(x,t))$ as the mean action-potential generating rate so that $U = g(P)$. Observe that

\[ \partial_t P(x,t) = \frac{1}{g'(P(x,t))}\Big[-\alpha g(P(x,t)) + \int_B w(x,y) P(y,t)\,dy\Big]. \tag{41} \]



For this equation, the problem observed in (39) should disappear as the integral only 
contains linear terms. One may define an energy-type functional 



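\[ E[P] := \int_B \Big[\int_0^{P(x)} \alpha g(r)\,dr - \frac{1}{2}\int_B w(x,y) P(y) P(x)\,dy\Big]\,dx. \]

Calculating the derivative yields

\[ \begin{aligned} \nabla_Q E[P] &= \lim_{\delta\to 0}\frac{1}{\delta}\big(E[P + \delta Q] - E[P]\big) \\ &= \int_B \alpha g(P(x)) Q(x)\,dx - \frac{1}{2}\int_B\int_B w(x,y)\{P(y)Q(x) + P(x)Q(y)\}\,dy\,dx \\ &= \langle \alpha g(P), Q\rangle - \Big\langle \int_B w(\cdot,y) P(y)\,dy,\ Q\Big\rangle. \end{aligned} \]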






This shows that there is a hidden energy-type flow structure in the Amari model under the assumptions (40) so that

\[ \partial_t P(x,t) = -\frac{1}{g'(P(x,t))}\,\nabla E[P(x,t)]. \tag{42} \]
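
The energy-type structure can be checked numerically on a grid. The following sketch discretizes $B = [0,1]$, uses the sigmoid $f(u) = (1 + e^{-u})^{-1}$ (so that $g(p) = \ln(p/(1-p))$ and $1/g'(p) = p(1-p) > 0$), and verifies that $E[P]$ does not increase along an explicit Euler discretization of the flow. The Gaussian kernel, grid size, step size and initial profile are assumptions chosen only for this illustration.

```python
import numpy as np

n = 64
x = (np.arange(n) + 0.5) / n                         # midpoint grid on B = [0, 1]
dx = 1.0 / n
alpha = 1.0
W = np.exp(-(x[:, None] - x[None, :])**2 / 0.02)     # symmetric kernel (assumed)

def g(p):
    """Inverse of the sigmoid f(u) = 1 / (1 + exp(-u))."""
    return np.log(p / (1.0 - p))

def energy(p):
    """Discretized E[P]; int_0^p g(r) dr = p ln p + (1 - p) ln(1 - p)."""
    local = alpha * (p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    coupling = 0.5 * p * (W @ p) * dx
    return np.sum(local - coupling) * dx

def rhs(p):
    """Right-hand side of the transformed equation, using 1/g'(p) = p (1 - p)."""
    return p * (1.0 - p) * (-alpha * g(p) + (W @ p) * dx)

p = 0.5 + 0.3 * np.sin(2 * np.pi * x)                # initial rate profile in (0, 1)
dt = 0.01
energies = [energy(p)]
for _ in range(500):
    p = p + dt * rhs(p)                              # explicit Euler step
    energies.append(energy(p))
energies = np.array(energies)
print(energies[0], energies[-1])
```

The monotone decay relies on $g' > 0$; the state-dependent factor $1/g'(P)$ rescales the descent direction pointwise but does not change its sign.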

However, even with this variable transformation, there seems to be little hope to de- 
rive a precise Kramers' rule for the stochastic Amari model (2) by generalizing the 
approach for SPDE systems [4, 6, 7]. The problems are as follows: 

- There is still a space-time dependent nonlinear prefactor $1/g'(P(x,t))$ in (42) for the deterministic system, so the system is not an exact gradient flow for a potential.

- Applying the change of variable $P_t(x) := f(U_t(x))$ for the stochastic Amari model (2) requires an Itô-type formula so that

\[ dP_t(x) = \frac{1}{g'(P_t(x))}\Big[-\alpha g(P_t(x)) + \int_B w(x,y) P_t(y)\,dy + O(\epsilon^2)\Big]\,dt + \epsilon M(P_t(x))\,dW_t(x), \tag{43} \]

where $M(P_t(x))$ is now a multiplicative noise term; see [24] and references therein for more details on infinite-dimensional Itô-type formulas. The higher-order term $O(\epsilon^2)$ in the drift part of (43) is not expected to cause difficulties, but a multiplicative noise structure definitely excludes the direct application of Kramers' law.

- Even if we just assumed, without any immediate physical motivation, that the noise term in (43) is purely additive, $\epsilon\,dW_t(x)$, there is a problem in applying Kramers' law since we do not have a structure like in (22) with $G(\cdot) = \mathrm{Id}$: $W_t(x)$ is a $Q$-Wiener process defined in (5), and driving space-time white noise in (4) is excluded due to the nonexistence of a solution.

Based on these observations, an immediate approach to generalize a sharp 
Kramers' formula to neural fields seems unlikely. Hence we try to understand an 
LDP for the stochastic Amari-type model (2). 



7 Direct Approach to an LDP 

A general direct approach for the derivation of an LDP for infinite-dimensional 
stochastic evolution equations is presented in [23] and further results have been ob- 
tained for certain additional classes of SPDEs [17-19, 57]. The results in [23] are 
valid for semilinear equations with suitable Lipschitz assumptions on the nonlinear- 
ity and with solutions taking values in C(V). We state the available results applied 
to continuous solutions of the Amari equation (4) assuming that the conditions of 
Lemma 2.1 are satisfied. 

For the following, we assume that there exists an open neighborhood $\mathcal{D} \subset C(B)$ containing a stable equilibrium state $u^*$ of the deterministic Amari equation (16) such that $\mathcal{D}$ is contained in the basin of attraction of $u^*$. We are interested in the rate function and the first-exit time of the process from $\mathcal{D}$ given by

\[ \tau^\epsilon_{\mathcal{D}} = \inf\{t > 0 : U_t \notin \mathcal{D}\} \]

if $U$ starts in the deterministic equilibrium state $u^*$. In order to state the quasipotential for $u^*$, we consider the control system

\[ \dot{y} = -\alpha y + KF(y) + Q^{1/2} v, \quad y_0 = x \in C(B) \tag{44} \]

for controls $v \in L^2((0,T), L^2(B))$ for all $T > 0$ and denote by $y^{x,v}$ its unique mild solution$^2$ taking values in $C([0,T], C(B))$ for all $T > 0$. Then we define

\[ I(u^*, z) = \inf\Big\{\frac{1}{2}\int_0^T \|v(s)\|^2_{L^2}\,ds : y^{u^*,v}(T) = z,\ T > 0\Big\}, \tag{45} \]

where this quasipotential relates to the minimal energy necessary to move the control system (44) started at the equilibrium state $u^*$ to $z$.

Theorem 7.1 ([23, Theorem 12.18]) It holds that

\[ \lim_{\epsilon\to 0} \epsilon^2 \ln E\big[\tau^\epsilon_{\mathcal{D}} \,\big|\, U_0 = u^*\big] = \inf_{z \in \partial\mathcal{D}} I(u^*, z). \]

Following further the exposition in [23, Sect. 12], explicit formulae for the rate function $I$ are only available in the special case of the drift possessing gradient structure and space-time white noise. As we have argued above, this structure is not satisfied for neural field equations. Hence, the same observations as presented at the end of the last section prevent a further direct analytic approach to the LDP. Therefore, we try to understand the LDP problem for a discretized approximate finite-dimensional version of the neural field equation.



2 The existence of such a solution is guaranteed by standard results on deterministic equations (cf. [23, Sect. A.3]) as long as $Q^{1/2}$ maps $L^2(B)$ continuously into $C(B)$. This is easily established. The unique square root $Q^{1/2}$ of $Q$ is the Hilbert-Schmidt operator given by $Q^{1/2} g = \sum_{i=1}^\infty \lambda_i \langle v_i, g\rangle v_i$ for all $g \in L^2(B)$, and in order to show that $Q^{1/2} g \in C(B)$ it remains to establish that the partial sums converge uniformly on $B$. This holds as for all $x \in B$

\[ \Big|\sum_{i=N}^\infty \lambda_i \langle v_i, g\rangle v_i(x)\Big| \le \Big(\sum_{i=N}^\infty \lambda_i^2 v_i(x)^2\Big)^{1/2} \Big(\sum_{i=N}^\infty \langle v_i, g\rangle^2\Big)^{1/2} \le \Big(\sup_{x\in B}\sum_{i=N}^\infty \lambda_i^2 v_i(x)^2\Big)^{1/2} \Big(\sum_{i=1}^\infty \langle v_i, g\rangle^2\Big)^{1/2}. \]

Hence, the upper bound, which is finite due to (7), is independent of $x$ and converges to zero for $N \to \infty$. Moreover, we further find that $g(t) \in L^2((0,T), L^2(B))$ implies $Q^{1/2} g(t) \in L^2((0,T), C(B))$.
8 Galerkin Approximation

Throughout the section, we assume that the assumptions (H1)-(H3) are satisfied. As 
a discretized version of the neural field equation (2), we consider its spectral Galerkin 
approximations; recall that the solution U t of (2) lies in C([0, T], L 2 (B)) as discussed 
in Sect. 2. In order to decouple the noise, we define the spectral representation of the 
solution 

\[ U_t(x) = \sum_{i=1}^\infty u^i_t\, v_i(x). \tag{46} \]

Here, the orthonormal basis functions $v_i$ are given by the eigenfunctions of the covariance operator of the noise with corresponding eigenvalues $\lambda_i^2$, see Eq. (5). To obtain an equation for the coefficients $u^i_t$, we take the inner product of Eq. (4) with the basis functions $v_i$, which yields

\[ \langle dU_t, v_i\rangle = \big[-\alpha\langle U_t, v_i\rangle + \langle KF(U_t), v_i\rangle\big]\,dt + \epsilon\langle dW_t, v_i\rangle \quad \text{for } i \in \mathbb{N}. \]

After plugging in (46), we obtain for $u^i$ the countable Galerkin system

\[ du^i_t = \big[-\alpha u^i_t + (KF)^i(u^1_t, u^2_t, \ldots)\big]\,dt + \epsilon\lambda_i\,d\beta^i_t \quad \text{for } i \in \mathbb{N}. \tag{47} \]

Here, the nonlinearities coupling all the equations are given by

\[ (KF)^i(u^1_t, u^2_t, \ldots) = \int_B v_i(x)\Big(\int_B w(x,y)\, f\Big(\sum_{j=1}^\infty u^j_t v_j(y)\Big)\,dy\Big)\,dx \]

due to the symmetry of the kernel $w$. If, in addition, we assume that (H4) holds and $K$ and $Q$ possess the same eigenfunctions and the eigenvalues are related as discussed in Sect. 3, the nonlinearities become

\[ (KF)^i(u^1_t, u^2_t, \ldots) = \lambda_i \int_B f\Big(\sum_{j=1}^\infty u^j_t v_j(x)\Big)\, v_i(x)\,dx. \tag{48} \]

The $N$th Galerkin approximation $U^N$ to $U$ is obtained by truncating the spectral representation (46), and is thus given by

\[ U^N_t = \sum_{i=1}^N u^{i,N}_t\, v_i, \tag{49} \]

where $u^{i,N}_t$ are the solutions to the $N$-dimensional Galerkin SDE system

\[ du^{i,N}_t = \big[-\alpha u^{i,N}_t + (KF)^{i,N}(u^{1,N}_t, \ldots, u^{N,N}_t)\big]\,dt + \epsilon\lambda_i\,d\beta^i_t \quad \forall i = 1, \ldots, N, \tag{50} \]

where the nonlinearities $(KF)^{i,N}$ are given by

\[ (KF)^{i,N}(u^{1,N}, \ldots, u^{N,N}) = \int_B f\Big(\sum_{j=1}^N u^{j,N} v_j(x)\Big)\Big(\int_B w(x,y)\, v_i(y)\,dy\Big)\,dx \tag{51} \]

or, in the special case of Sect. 3, by

\[ (KF)^{i,N}(u^{1,N}, \ldots, u^{N,N}) = \lambda_i \int_B f\Big(\sum_{j=1}^N u^{j,N} v_j(x)\Big)\, v_i(x)\,dx, \tag{52} \]

respectively. The following theorem establishes the almost sure convergence of the 
Galerkin approximations to the solution of (4). Therefore, we may be able to in- 
fer properties of the behavior of paths of the solution from the path behavior of the 
Galerkin approximations. We have deferred the proof of the theorem to the Appendix. 
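
To make the structure of the Galerkin system concrete, the following sketch simulates (50) with the nonlinearity (52) by an Euler-Maruyama scheme. Everything specific in it is an assumption for illustration: the domain $B = [0,1]$, the cosine basis standing in for the eigenfunctions $v_i$, the eigenvalue sequence `lam`, the sigmoid $f$, the midpoint quadrature and all numerical parameters.

```python
import numpy as np

N, M = 8, 256                        # number of modes, quadrature points (assumed)
alpha, eps, dt, steps = 1.0, 0.1, 1e-3, 2000
x = (np.arange(M) + 0.5) / M         # midpoint rule on B = [0, 1]
dx = 1.0 / M

# Cosine basis, orthonormal in L^2(0, 1); stands in for the eigenfunctions v_i.
V = np.sqrt(2.0) * np.cos(np.pi * np.arange(N)[:, None] * x[None, :])
V[0, :] = 1.0
lam = 1.0 / (1.0 + np.arange(N))**2  # assumed eigenvalues lambda_i of K

def f(u):
    """Bounded sigmoidal gain function."""
    return 1.0 / (1.0 + np.exp(-u))

def KF(u):
    """Nonlinearity (52): lambda_i * int_B f(sum_j u_j v_j(x)) v_i(x) dx."""
    return lam * (V @ f(u @ V)) * dx

def simulate(u0, rng):
    """Euler-Maruyama scheme for the N-dimensional Galerkin system (50)."""
    u = np.array(u0, dtype=float)
    path = np.empty((steps + 1, N))
    path[0] = u
    for k in range(steps):
        dbeta = np.sqrt(dt) * rng.standard_normal(N)   # Brownian increments
        u = u + (-alpha * u + KF(u)) * dt + eps * lam * dbeta
        path[k + 1] = u
    return path

path = simulate(np.zeros(N), np.random.default_rng(0))
print(path[-1])
```

Because both the coupling term and the noise amplitude of the $i$th equation carry the factor $\lambda_i$, the higher modes in this sketch are driven only weakly and are dominated by the linear decay $-\alpha u^{i,N}_t$.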



Theorem 8.1 It holds for all $T > 0$ that

\[ \lim_{N\to\infty} \sup_{t\in[0,T]} \|U_t - U^N_t\|_{L^2(B)} = 0 \quad \text{a.s.} \]

If, in addition, the series $\sum_{i=1}^\infty \lambda_i^2 v_i^2$ converges in $C(B)$ and the functions $v_i$ are Lipschitz continuous with Lipschitz constants $L_i$ such that $\sup_{x\in B}\sum_{i=1}^\infty \lambda_i^2 L_i^{2\rho} |v_i(x)|^{2(1-\rho)} < \infty$ for a $\rho \in (0,1)$ (i.e., the conditions of Lemma 2.1 are satisfied), $U_0 \in C(B)$ is such that $\lim_{N\to\infty} \|U_0 - P^N U_0\|_0 = 0$, and $K$ is compact on $C(B)$, then it holds for all $T > 0$ that

\[ \lim_{N\to\infty} \sup_{t\in[0,T]} \|U_t - U^N_t\|_0 = 0 \quad \text{a.s.} \]



9 Approximating the LDP 

The LDP in Theorem 7.1 is not immediately computable. Here, we show that a finite-dimensional approximation can be made and what the structure of this approximation entails. For simplicity, consider the case when the diagonal diffusion matrix $\mathfrak{D}$ with entries $\mathfrak{D}_{ii} = \lambda_i^2$ is positive definite, i.e., $\lambda_i \neq 0$ for all $i \in \mathbb{N}$. Observe that the inverse of $\mathfrak{D}$ induces an inner product on $\mathbb{R}^N$ for $N \in \mathbb{N} \cup \{\infty\}$ via

\[ \langle a, b\rangle_N := a^T \mathfrak{D}^{-1} b = [\mathfrak{D}^{-1/2} a]^T [\mathfrak{D}^{-1/2} b] \quad \text{for } a, b \in \mathbb{R}^N, \]

where $\mathfrak{D}$ is understood as the projection onto $\mathbb{R}^{N\times N}$ if $N < \infty$. We are also going to use the notation introduced in Sect. 8 for the Galerkin approximation, i.e., $u^{\cdot,N}_t$ denotes the vector

\[ (u^{1,N}_t, u^{2,N}_t, \ldots, u^{N,N}_t)^T \in \mathbb{R}^N, \tag{53} \]

where $u^{i,N}_t$ denotes the solutions of the $N$-dimensional system (50). Note that throughout this section we shall always work with the Galerkin coefficients, e.g., $u_t$ refers to the vector

\[ (u^1_t, u^2_t, \ldots)^T \in \mathbb{R}^\infty. \]

Furthermore, for arbitrary functions $\phi_t \in L^2(B)$, which are used in the formulation of the rate function, we use the notation $\phi^{\cdot,N}_t$ to denote the projection onto the first $N$ Galerkin coefficients. Theorem 5.1 immediately implies the following:



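Proposition 9.1 For the finite-dimensional Galerkin system (50) the rate function is given by

\[ I^N(\phi^{\cdot,N}) = \begin{cases} \frac{1}{2}\int_0^T \big\langle (\phi^{\cdot,N}_t)' - g^{\cdot,N}(\phi^{\cdot,N}_t),\ (\phi^{\cdot,N}_t)' - g^{\cdot,N}(\phi^{\cdot,N}_t)\big\rangle_N\,dt, & \text{if } \phi^{\cdot,N} \in u^{\cdot,N}_0 + H_1^N, \\ +\infty, & \text{otherwise,} \end{cases} \tag{54} \]

where $g^{i,N}(\phi^{\cdot,N}_t) = -\alpha\phi^{i,N}_t + (KF)^{i,N}(\phi^{\cdot,N}_t)$.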
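
A discretized version of the rate function (54) can be evaluated for any sampled path. The sketch below uses a forward difference for the time derivative and a left-endpoint rule for the time integral; the function name `rate_function`, the three-mode example, the eigenvalues `lam` and the drift with a `tanh` coupling are hypothetical choices for illustration, not the Amari nonlinearity.

```python
import numpy as np

def rate_function(path, dt, drift, lam):
    """Forward-difference approximation of I^N in (54) for a discrete path.

    path has shape (steps + 1, N); lam holds lambda_i with D_ii = lambda_i^2.
    """
    velocity = (path[1:] - path[:-1]) / dt                  # approximates phi'
    residual = velocity - np.array([drift(p) for p in path[:-1]])
    # <a, a>_N = sum_i a_i^2 / lambda_i^2, integrated with a left-endpoint rule
    return 0.5 * np.sum(residual**2 / lam**2) * dt

alpha = 1.0
lam = np.array([1.0, 0.5, 0.25])                            # assumed eigenvalues

def drift(u):
    """Hypothetical drift of the form -alpha u + lambda_i * (bounded coupling)."""
    return -alpha * u + lam * np.tanh(u.sum())

dt, steps = 1e-2, 300
det_path = np.empty((steps + 1, 3))
det_path[0] = [1.0, -0.5, 0.2]
for k in range(steps):                                      # deterministic Euler path
    det_path[k + 1] = det_path[k] + dt * drift(det_path[k])

# A path with the same initial condition that deviates from the deterministic flow.
shifted = det_path + 0.1 * np.linspace(0.0, 1.0, steps + 1)[:, None]

cost_det = rate_function(det_path, dt, drift, lam)
cost_shifted = rate_function(shifted, dt, drift, lam)
print(cost_det, cost_shifted)
```

The deterministic path has (numerically) zero cost, while any deviation is penalized with weights $\lambda_i^{-2}$, so deviations in modes with small eigenvalues are the most expensive.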

Recall from Sect. 7 that Theorem 7.1 provides a large deviation principle. For the case when $Q$ is a positive operator, we may formally rewrite the control system (44) as

\[ \mathfrak{D}^{-1/2}\big[\dot{y} - (-\alpha y + KF(y))\big] = v \tag{55} \]

so that the rate function for the Amari model can be expressed as

\[ I(\phi) = \begin{cases} \frac{1}{2}\int_0^T \int_B \mathfrak{D}^{-1/2}[\phi_t' - g(\phi_t)]\; \mathfrak{D}^{-1/2}[\phi_t' - g(\phi_t)]\,dx\,dt, & \text{if } \phi \in u_0 + H_1^\infty, \\ +\infty, & \text{otherwise,} \end{cases} \tag{56} \]

where $g(\phi_t) = -\alpha\phi_t + KF(\phi_t)$ and $\mathfrak{D}^{-1/2} u = \sum_{i=1}^\infty (\mathfrak{D}^{-1/2} u^{\cdot})^i\, v_i$. Therefore, the next result just implies that the Galerkin approximation is consistent for the LDP.

Proposition 9.2 For each $\phi \in u_0 + H_1^\infty$ we have $\lim_{N\to\infty} |I(\phi) - I^N(\phi^{\cdot,N})| = 0$.

Proof Considering the finite-dimensional rate function (54) it suffices to notice that

\[ \big\langle (\phi^{\cdot,N}_t)' - g^{\cdot,N}(\phi^{\cdot,N}_t),\ (\phi^{\cdot,N}_t)' - g^{\cdot,N}(\phi^{\cdot,N}_t)\big\rangle_N \]

equals the spatial integral in (56) restricted to the first $N$ Galerkin modes, by orthonormality of the basis in $L^2(B)$. □

Hence, we may work with the finite-dimensional Galerkin system and its LDP for computational purposes. However, the truncation $N$ may still be very large. We are going to show, using a formal analysis for a certain case, that there is an intrinsic multiscale structure of the rate function. We assume that we are in the special case considered in Sect. 3 where $K$ and $Q$ have the same eigenfunctions and the corresponding eigenvalues are given by $\lambda_i$ and $\lambda_i^2$, respectively.

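Lemma 9.1 For each $N \in \mathbb{N}$, the first part of the rate function (54) can be rewritten as

\[ I^N(\phi^{\cdot,N}) = \frac{1}{2}\int_0^T a_1^N - 2a_2^N + a_3^N\,dt, \tag{57} \]

where the three terms are given by

\[ \begin{aligned} a_1^N &= \big\langle (\phi^{\cdot,N}_t)' + \alpha\phi^{\cdot,N}_t,\ (\phi^{\cdot,N}_t)' + \alpha\phi^{\cdot,N}_t\big\rangle_N, \\ a_2^N &= \big\langle (\phi^{\cdot,N}_t)' + \alpha\phi^{\cdot,N}_t,\ (KF)^{\cdot,N}(\phi^{\cdot,N}_t)\big\rangle_N, \\ a_3^N &= (\widetilde{KF})^{\cdot,N}(\phi^{\cdot,N}_t)^T\, (\widetilde{KF})^{\cdot,N}(\phi^{\cdot,N}_t), \end{aligned} \]

and $(\widetilde{KF})^{i,N} = \frac{1}{\lambda_i}(KF)^{i,N}$.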

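Proof For notational simplicity, we shall temporarily omit in this proof the subscript for the inner product $\langle\cdot,\cdot\rangle_N = \langle\cdot,\cdot\rangle$ as well as the Galerkin index, e.g., $\phi^{\cdot,N}_t = \phi_t$, as it is understood that we work with $N$-dimensional vectors in this proof. Consider the following general calculation:

\[ \begin{aligned} \langle \phi_t' - g(\phi_t),\ \phi_t' - g(\phi_t)\rangle &= \langle\phi_t', \phi_t'\rangle - 2\langle\phi_t', g(\phi_t)\rangle + \langle g(\phi_t), g(\phi_t)\rangle \\ &= \langle\phi_t', \phi_t'\rangle + 2\alpha\langle\phi_t', \phi_t\rangle - 2\langle\phi_t', KF(\phi_t)\rangle \\ &\quad + \langle KF(\phi_t), KF(\phi_t)\rangle + \alpha^2\langle\phi_t, \phi_t\rangle - 2\alpha\langle\phi_t, KF(\phi_t)\rangle \\ &= \langle\phi_t' + \alpha\phi_t,\ \phi_t' + \alpha\phi_t\rangle - 2\langle\phi_t' + \alpha\phi_t,\ KF(\phi_t)\rangle + \widetilde{KF}(\phi_t)^T\,\widetilde{KF}(\phi_t) \end{aligned} \]

and observe that the result is independent of $N$. □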
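
The algebraic identity behind Lemma 9.1 is easy to check numerically. The sketch below draws arbitrary vectors for $\phi_t$, $\phi_t'$ and $KF(\phi_t)$ and compares the integrand of (54) with $a_1^N - 2a_2^N + a_3^N$; the dimension, the value of $\alpha$ and the eigenvalue sequence `lam` are assumptions made only for the test.

```python
import numpy as np

rng = np.random.default_rng(1)
N, alpha = 6, 0.7
lam = 1.0 / (1.0 + np.arange(N))                  # assumed eigenvalues lambda_i
phi, dphi, KF = rng.standard_normal((3, N))       # arbitrary phi, phi', KF(phi)

def inner(a, b):
    """Weighted inner product <a, b>_N = a^T D^{-1} b with D_ii = lambda_i^2."""
    return np.sum(a * b / lam**2)

g = -alpha * phi + KF
lhs = inner(dphi - g, dphi - g)                   # integrand of (54)

a1 = inner(dphi + alpha * phi, dphi + alpha * phi)
a2 = inner(dphi + alpha * phi, KF)
KF_tilde = KF / lam                               # rescaled nonlinearity
a3 = KF_tilde @ KF_tilde
rhs = a1 - 2 * a2 + a3
print(lhs, rhs)
```

Writing the last term through the rescaled nonlinearity removes the weights $\lambda_i^{-2}$ from $a_3^N$, which is what keeps this term bounded when $f$ is bounded.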

It is important to point out that the LDP from Theorem 5.1 requires the infimum of the rate function. From Lemma 9.1, we know that the rate function splits into three terms. The three terms are interesting in the asymptotic limit $N \to \infty$. Suppose

\[ (\phi^{N,N}_t)' + \alpha\phi^{N,N}_t = O(\kappa(N)) \quad \text{and} \quad \widetilde{KF}^{N,N}(\phi^{\cdot,N}_t) = O(\eta(N)) \]

as $N \to \infty$ for some nonnegative functions $\kappa$, $\eta$. Then Lemma 9.1 yields

\[ a_1^N = O(\kappa(N)^2\lambda_N^{-2}), \quad a_2^N = O(\kappa(N)\eta(N)\lambda_N^{-1}), \quad a_3^N = O(\eta(N)^2). \]

Lemma 9.2 Suppose there exists a positive constant $K_f$ such that

\[ \sup_{x\in\mathbb{R}} |f(x)| \le K_f; \tag{58} \]

then $\eta(N) = 1$.

Proof A direct estimate yields

\[ \big|\widetilde{KF}^{j,N}(\phi^{\cdot,N}_t)\big| \le \int_B \Big|f\Big(\sum_{i=1}^N \phi^{i,N}_t v_i(x)\Big)\Big|\,|v_j(x)|\,dx \le K_f \int_B |v_j(x)|\,dx. \]

Since $\|v_j\|_{L^2(B)} = 1$ and $L^2(B) \subset L^1(B)$, the last integral is uniformly bounded over $j \in \mathbb{N}$ by $\mathrm{meas}(B)^{1/2}$. □



We remark that several typical functions $f$ discussed in Sect. 2 such as $f(u) = (1 + e^{-u})^{-1}$ and $f(u) = (\tanh(u) + 1)/2$ are globally bounded so that Lemma 9.2 does apply to many practical cases. In this situation, we get that

\[ a_1^N = O(\kappa(N)^2\lambda_N^{-2}), \quad a_2^N = O(\kappa(N)\lambda_N^{-1}), \quad a_3^N = O(1). \]

We make a case distinction between the different relative asymptotics of $\kappa(N)$ and $\lambda_N$. Note that the following asymptotic relations are purely formal:

- If $\kappa(N) \ll \lambda_N$ or $\kappa(N) \sim \lambda_N$ as $N \to \infty$, then we can conclude that $\kappa(N) \to 0$, i.e.,

\[ (\phi^{N,N}_t)' + \alpha\phi^{N,N}_t \to 0 \quad \text{as } N \to \infty, \tag{59} \]

since for trace-class noise we know that $\lambda_N \to 0$. If we formally require that $(\phi^{N,N}_t)' + \alpha\phi^{N,N}_t = 0$ for $N$ sufficiently large, then the higher-order Galerkin modes decay exponentially in time:

\[ \phi^{N,N}_t = \phi^{N,N}_0\, e^{-\alpha t}. \]

- If $\kappa(N) \gg \lambda_N$ as $N \to \infty$, then $a_1^N \gg -2a_2^N + a_3^N$ and the first term dominates the asymptotics. But $a_1^N > 0$ for all $N$ so that the rate function only has a finite infimum if $a_1^N \to 0$ as $N \to \infty$. This implies again that (59) holds for the case of a finite infimum.

Hence, we get in many reasonable first-exit problems for the Amari model with trace-class noise that there is a finite set for $n < N$ of "slow" or "center-like" directions and an infinite set of "fast" or "stable" directions for $n > N$. Although we have made this observation from the rate function alone, it is entirely natural considering the structure of the Galerkin approximation. Indeed, for the case when the eigenvalues of $K$ and $Q$ are related, we may write (50) as

\[ du^{i,N}_t = \big(-\alpha u^{i,N}_t + \lambda_i[\cdots]\big)\,dt + \epsilon\lambda_i\,d\beta^i_t \tag{60} \]

so that for bounded nonlinearity $f$, which is represented in the terms $[\cdots]$ in (60), the higher-order modes should really just be governed by $du^{i,N}_t = -\alpha u^{i,N}_t\,dt$.

Hence, Propositions 9.1-9.2 and the multiscale nature of the problem induced by the trace-class noise suggest a procedure for approximating the rate function and the associated LDP in practice. In particular, we may compute the eigenvalues and eigenfunctions of $K$ and $Q$ up to a sufficiently large given order $N^*$. This yields an explicit representation of the Galerkin system and the associated rate function. Then one may apply any finite-dimensional technique to understand the rate function. One may even find a better truncation order $N < N^*$ based on the knowledge that the minimizer of the rate function must have components that decay (almost) exponentially in time for orders bigger than $N$.

10 Outlook 

In this paper, we have discussed several steps toward a better understanding of noise- 
induced transitions in continuum neural fields. Although we have provided the main 
basic elements via the LDP and finite-dimensional approximations, there are still 
several very interesting open problems. 

We have demonstrated that a sharp Kramers' rate calculation for neural fields 
with trace-class noise is very challenging as the techniques for white-noise gradient- 
structure SPDEs cannot be applied directly. However, we have seen in Sect. 4 that the 
deterministic dynamics for neural fields frequently exhibits a classical bistable struc- 
ture with a saddle-state between stable equilibria. This suggests that there should 
be a Kramers' law with exponential scaling in the noise intensity as well as a pre- 
cisely computable pre-factor. It is interesting to ask how this pre-factor depends on 
the eigenvalues of the trace-class operator $Q$ defining the $Q$-Wiener process. We expect that new technical tools are needed to answer this question.

From the viewpoint of experimental data, the exponential scaling for the LDP is 
relevant as it shows that noise-induced transitions have exponential interarrival times. 
This leads to the possibility that working memory as well as perceptual bistability 
could be governed by a Poisson process. However, the same phenomena could also 
be governed by a slowly varying variable, i.e., by an adaptive neural field [14]; the 
"fast" activity variable U in the Amari model is augmented by one or more "slow" 
variables. In this context, the required assumptions on the equilibrium structure in 
Sect. 4 and the noise in Sect. 3 are not necessary to produce a bistable switch and
the fast variable U can, e.g., just have a single deterministically unstable equilibrium 
and bistable, nonrandom switching between metastable states may occur. Of course, 
there is also the possibility that an intermediate regime between noise-induced and 
deterministic escape is relevant [53]. 

It is interesting to note that the same problem arises generically across many nat- 
ural sciences in the study of critical transitions (or "tipping points") [48, 59]. The 
question which escape mechanism from a metastable state matches the data is of- 
ten discussed very controversially and we shall not aim to provide a discussion here. 
However, our main goal to make the LDP and its associated rate functional as explicit 
as possible should definitely help to simplify comparison between models and exper- 
iment. For example, a parameter study or data assimilation for the finite-dimensional Galerkin system considered in Theorem 8.1 and the associated rate function in Proposition 9.1 are often easier than working directly with the abstract solutions of the stochastic Amari model in $C([0,T], L^2(B))$.

To study the parameter dependence is an interesting open question, which we aim 
to address in future work. In particular, the next step is to use the Galerkin approx- 
imations in Sect. 8 and the associated LDP in Sect. 9 for numerical purposes [49]. 
Recent work for SPDEs [8] suggests that a spectral method can also be efficient for 
stochastic neural fields. Results on numerical continuation and jump heights for SDEs 
[47] can also be immediately transferred to the spectral approximation, which would 
allow for studies of bifurcations and associated noise-induced phenomena. 

One may also ask how far the technical assumptions we make in this paper can 
be weakened. It is not clear which parts of the global Lipschitz assumptions may be 
replaced by local assumptions or removed altogether. Similar remarks apply to the 
multiscale nature of the problem induced by the decay of the eigenvalues of Q . How 
far this observation can be exploited to derive more efficient analytical as well as 
numerical techniques remains to be investigated. 

On a more abstract level, it would certainly be desirable to extend our basic frame- 
work to other topics that have been considered already for deterministic neural fields. 
A generalization to activity-based models with nonlinearity $f\big(\int_B w(x,y)u(y)\,dy\big)$
seems possible. Furthermore, it may be highly desirable to go beyond stationary so- 
lutions and investigate noise-induced switching and transitions for traveling waves 
and patterns.

Appendix: Convergence of the Galerkin Approximation

Proof of Theorem 8.1 We fix a $T > 0$. Throughout the proof an unspecified norm $\|\cdot\|$ or operator norm $|||\cdot|||$, respectively, is either for the Hilbert space $L^2(B)$ or the Banach space $C(B)$, and estimates using the unspecified notation are valid in both cases. Furthermore, $C > 0$ denotes an arbitrary deterministic constant, which may change from line to line but depends only on $T$. We begin the proof by obtaining an a priori growth bound on the solution of the Amari equation (4). Using the linear growth condition on $F$ implied by its Lipschitz continuity, we obtain the estimate

\[ \|U_t\| \le e^{-\alpha t}\|U_0\| + C\int_0^t e^{-\alpha(t-s)}(1 + \|U_s\|)\,ds + \|O_t\|. \]

Due to Gronwall's inequality, there exists a deterministic constant $C$ such that

\[ \sup_{t\in[0,T]} \|U_t\| \le C\Big(1 + \|U_0\| + \sup_{t\in[0,T]} \|O_t\|\Big)e^{CT} \quad \text{a.s.} \tag{61} \]

Note that $O$ is an Ornstein-Uhlenbeck process, and it thus holds

\[ \sup_{t\in[0,T]} \|O_t\|_{L^2} < \infty \]

almost surely and under the assumptions of Lemma 2.1 in addition

\[ \sup_{t\in[0,T]} \|O_t\|_0 < \infty \]

almost surely.

Let $P^N$ denote the projection operator from $L^2(B)$ to the subspace spanned by the first $N$ basis functions. Then we find that in Hilbert space notation the $N$th Galerkin approximation satisfies

\[ U^N_t = e^{-\alpha t} P^N U_0 + \int_0^t e^{-\alpha(t-s)} P^N KF(U^N_s)\,ds + \epsilon\,O^N_t. \]

Here, we use $O^N$ to be shorthand for the truncated stochastic convolution

\[ O^N_t := \sum_{i=1}^N \lambda_i \int_0^t e^{-\alpha(t-s)}\,d\beta^i_s\; v_i. \tag{62} \]

Hence, we obtain for the error of the Galerkin approximation

\[ U_t - U^N_t = e^{-\alpha t}(U_0 - P^N U_0) + \int_0^t e^{-\alpha(t-s)}\big(KF(U_s) - P^N KF(U^N_s)\big)\,ds + \epsilon(O_t - O^N_t). \]

Adding and subtracting the obvious terms yields for the norm the estimate

\[ \begin{aligned} \|U_t - U^N_t\| &\le e^{-\alpha t}\|U_0 - P^N U_0\| + |||P^N K||| \int_0^t e^{-\alpha(t-s)}\|F(U_s) - F(U^N_s)\|\,ds \\ &\quad + |||K - P^N K||| \int_0^t e^{-\alpha(t-s)}\|F(U_s)\|\,ds + \epsilon\|O_t - O^N_t\|, \end{aligned} \]

where $|||P^N K|||_{L^2} \le |||K|||_{L^2}$ and $\sup_{N\in\mathbb{N}} |||P^N K|||_0 < \infty$ as a consequence of [3, Lemma 11.1.4] (cf. the application of this result below). Next, using the Lipschitz and linear growth conditions on $F$, applying Gronwall's inequality, taking the supremum over all $t \in [0,T]$ and estimating using the bound (61) yield

\[ \begin{aligned} \sup_{t\in[0,T]} \|U_t - U^N_t\| &\le C\Big(\|U_0 - P^N U_0\| + |||K - P^N K|||\Big(1 + \|U_0\| + \sup_{t\in[0,T]}\|O_t\|\Big)\Big) \\ &\quad + C\Big(\sup_{t\in[0,T]} \|O_t - O^N_t\|\Big). \end{aligned} \tag{63} \]

It remains to show that the individual terms in the right-hand side converge to zero for $N \to \infty$ almost surely.

- It clearly holds that $\|U_0 - P^N U_0\|_{L^2} \to 0$, and the convergence $\|U_0 - P^N U_0\|_0 \to 0$ holds by assumption.

- Next, as argued above, $(1 + \|U_0\| + \sup_{t\in[0,T]}\|O_t\|)$ is a.s. finite and the compactness of the operator $K$ implies $|||K - P^N K||| \to 0$ for $N \to \infty$, see [3, Lemma 12.1.4].

- Finally, the third error term $\sup_{t\in[0,T]} \|O_t - O^N_t\|$ vanishes if the Galerkin approximations $O^N$ of the Ornstein-Uhlenbeck process $O$ converge almost surely in the spaces $C([0,T], L^2(B))$ and $C([0,T], C(B))$, respectively. This convergence is proven in Lemma A.1 below.

The proof is completed. □ 



The following lemma contains the convergence of the Galerkin approximation of 
the Ornstein-Uhlenbeck process necessary for proving Theorem 8.1. 

Lemma A.1 There exists a sequence $b_N > 0$ with $\lim_{N\to\infty} b_N = 0$ such that for all $T > 0$ and all $\delta > 0$ there exists a random variable $Z_\delta$ with $E|Z_\delta|^p < \infty$ for all $p \ge 1$ such that

\[ \sup_{t\in[0,T]} \|O_t - O^N_t\|_{L^2} \le Z_\delta\, b_N^{1-\delta} \]

almost surely. If, in addition, the series $\sum_{i=1}^\infty \lambda_i^2 v_i^2$ converges in $C(B)$ and the functions $v_i$ are Lipschitz continuous with Lipschitz constants $L_i$ such that $\sup_{x\in B}\sum_{i=1}^\infty \lambda_i^2 L_i^{2\rho} |v_i(x)|^{2(1-\rho)} < \infty$ for a $\rho \in (0,1)$, then it further holds that

\[ \sup_{t\in[0,T]} \|O_t - O^N_t\|_0 \le Z_\delta\, b_N^{1-\delta} \]

almost surely.

Remark A.1 Assumptions on the speed of convergence of the series $\sum_{i=1}^\infty \lambda_i^2$ and $\sum_{i=1}^\infty \lambda_i^2 v_i^2$ and $\sup_{x\in B}\sum_{i=1}^\infty \lambda_i^2 L_i^{2\rho} |v_i(x)|^{2(1-\rho)}$ readily yield a rate of convergence for the Galerkin approximation due to the definition of the constants $b_N$ in the proof of the lemma.



Proof of Lemma A.1 As in the proof of Theorem 8.1 the unspecified norm $\|\cdot\|$ denotes either the norm in $L^2(B)$ or in $C(B)$ and estimates are valid in both cases. We fix $T > 0$, $\rho \in (0,1)$ and a $p \in \mathbb{N}$ with $p > 2d/\rho$. Throughout the proof $C > 0$ denotes a constant that changes from line to line, but depends only on the fixed parameters $T$, $p$, $\rho$, $\alpha$ and the domain $B \subset \mathbb{R}^d$.

Then we obtain for all $N, M \in \mathbb{N}$ with $M < N$ using the factorization method (cf. [23, Sect. 5.3]) similarly to the proof of [8, Lemma 5.6] the estimate

\[ \Big(E \sup_{t\in[0,T]} \|O^N_t - O^M_t\|^p\Big)^{1/p} \le C \sup_{t\in[0,T]} \big(E\|Y^{M,N}_t\|^p\big)^{1/p}, \]

where $Y^{M,N}_t$ is the process defined by

\[ Y^{M,N}_t := \sum_{i=M+1}^N \lambda_i \int_0^t (t-s)^{-\rho/2} e^{-\alpha(t-s)}\,d\beta^i_s\; v_i. \]

In order to estimate the $p$th mean of the process $Y^{M,N}$, we proceed separately for the two cases $L^2(B)$ and $C(B)$.



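The case of $L^2(B)$: Due to the orthogonality of the basis functions and employing Hölder's inequality, one obtains

\[ \begin{aligned} E\|Y^{M,N}_t\|^p_{L^2} &\le E\Bigg(\Big(\sum_{i=M+1}^N \lambda_i^2\Big)^{(p-2)/p} \Big(\sum_{i=M+1}^N \lambda_i^2 \Big|\int_0^t (t-s)^{-\rho/2} e^{-\alpha(t-s)}\,d\beta^i_s\Big|^p\Big)^{2/p}\Bigg)^{p/2} \\ &= \Big(\sum_{i=M+1}^N \lambda_i^2\Big)^{(p-2)/2} \sum_{i=M+1}^N \lambda_i^2\, E\Big|\int_0^t (t-s)^{-\rho/2} e^{-\alpha(t-s)}\,d\beta^i_s\Big|^p. \end{aligned} \]

Next, as the stochastic integrals in the right-hand side are centered Gaussian random variables, [8, Lemma 5.2]$^3$ yields for all $t \le T$

\[ \begin{aligned} E\|Y^{M,N}_t\|^p_{L^2} &\le C\Big(\sum_{i=M+1}^N \lambda_i^2\Big)^{(p-2)/2} \sum_{i=M+1}^N \lambda_i^2 \Big(\int_0^t (t-s)^{-\rho} e^{-2\alpha(t-s)}\,ds\Big)^{p/2} \\ &\le C\Big(\sum_{i=M+1}^N \lambda_i^2\Big)^{(p-2)/2} \sum_{i=M+1}^N \lambda_i^2 = C\Big(\sum_{i=M+1}^N \lambda_i^2\Big)^{p/2}. \end{aligned} \]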

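Therefore, we obtain for all $M, N \in \mathbb{N}$ with $M < N$

\[ \Big(\sup_{t\in[0,T]} E\|Y^{M,N}_t\|^p_{L^2}\Big)^{1/p} \le C\Big(\sum_{i=M+1}^N \lambda_i^2\Big)^{1/2} \le C\Big(\sum_{i=M+1}^\infty \lambda_i^2\Big)^{1/2}, \tag{64} \]

where the final upper bound decreases to zero for $M \to \infty$ by assumption.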
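
The role of the tail sums $\sum_{i>N}\lambda_i^2$ can be illustrated directly for the truncated stochastic convolution. By the Itô isometry, the $i$th coefficient of $O_t$ has variance $\lambda_i^2(1 - e^{-2\alpha t})/(2\alpha)$, so the mean-square $L^2$ truncation error at a fixed time is an explicit tail sum. The sketch below evaluates it for an assumed eigenvalue decay $\lambda_i = i^{-3/2}$, with the infinite tail cut off at a large but finite number of modes; both choices are hypothetical.

```python
import numpy as np

alpha, t = 1.0, 2.0
i = np.arange(1, 20001, dtype=float)
lam = i**-1.5                        # assumed decay; sum of lam_i^2 converges

# Ito isometry: E|<O_t, v_i>|^2 = lam_i^2 (1 - exp(-2 alpha t)) / (2 alpha)
mode_var = lam**2 * (1.0 - np.exp(-2.0 * alpha * t)) / (2.0 * alpha)

def l2_error(N):
    """Root mean square L^2 error of O^N_t (tail cut off at len(lam) modes)."""
    return np.sqrt(mode_var[N:].sum())

def tail_bound(N):
    """Time-uniform bound (sum_{i>N} lam_i^2 / (2 alpha))^{1/2}."""
    return np.sqrt((lam[N:]**2).sum() / (2.0 * alpha))

Ns = [10, 20, 40, 80]
errors = np.array([l2_error(N) for N in Ns])
bounds = np.array([tail_bound(N) for N in Ns])
print(errors, bounds)
```

This only covers the error at a fixed time; the supremum over $t \in [0,T]$ in the lemma is what requires the factorization method and the process $Y^{M,N}$.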



The case of $C(B)$: In this case the estimates get a bit more involved. As $\rho/2 > d/p$, the continuous embedding of the Sobolev-Slobodeckij space $W^{\rho/2,p}(B)$ into $C(B)$ (cf. [58, Sect. 2.2.4 and 2.4.4]) and [8, Lemma 5.2] yield the estimates

\[ \begin{aligned} \sup_{t\in[0,T]} E\|Y^{M,N}_t\|^p_0 &\le C \sup_{t\in[0,T]} \int_B\int_B \frac{E|Y^{M,N}_t(x) - Y^{M,N}_t(y)|^p}{|x-y|^{d+\rho p/2}}\,dx\,dy + C \sup_{t\in[0,T]} \int_B E|Y^{M,N}_t(x)|^p\,dx \\ &\le C \sup_{t\in[0,T]} \int_B\int_B \frac{\big(E|Y^{M,N}_t(x) - Y^{M,N}_t(y)|^2\big)^{p/2}}{|x-y|^{d+\rho p/2}}\,dx\,dy + C \sup_{t\in[0,T]} \int_B \big(E|Y^{M,N}_t(x)|^2\big)^{p/2}\,dx. \end{aligned} \tag{65} \]



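We proceed by estimating the two expectation terms in the right-hand side. We obtain for all $M < N$ and all $x, y \in B$ for the first term

\[ \begin{aligned} E|Y^{M,N}_t(x) - Y^{M,N}_t(y)|^2 &= E\Big|\sum_{i=M+1}^N \lambda_i \int_0^t (t-s)^{-\rho/2} e^{-\alpha(t-s)}\,d\beta^i_s\,\big(v_i(x) - v_i(y)\big)\Big|^2 \\ &= \sum_{i=M+1}^N \lambda_i^2 \int_0^t s^{-\rho} e^{-2\alpha s}\,ds\;|v_i(x) - v_i(y)|^2 \\ &\le C \sum_{i=M+1}^N \lambda_i^2 L_i^{2\rho}\,|v_i(x) - v_i(y)|^{2(1-\rho)}\,|x-y|^{2\rho} \end{aligned} \tag{66} \]

for any $\rho \in (0,1)$ and for the second term

\[ \begin{aligned} E|Y^{M,N}_t(x)|^2 &\le \sum_{i=M+1}^N \lambda_i^2 \int_0^t (t-s)^{-\rho} e^{-2\alpha(t-s)}\,ds\; v_i(x)^2 \\ &\le C \sum_{i=M+1}^N \lambda_i^2 v_i(x)^2. \end{aligned} \tag{67} \]

3 For a centered Gaussian random variable $Z$, it holds $E Z^p \le p!\,(E Z^2)^{p/2}$ for all $p \in \mathbb{N}$.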

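Next, applying the estimates (67) and (66) to the right-hand side of (65) yields, noting that $\rho p/2 - d > 0$,

\[ \begin{aligned} \Big(\sup_{t\in[0,T]} E\|Y^{M,N}_t\|^p_0\Big)^{1/p} &\le C\Bigg(\int_B\int_B \frac{\big(\sum_{i=M+1}^N \lambda_i^2 L_i^{2\rho}|v_i(x) - v_i(y)|^{2(1-\rho)}|x-y|^{2\rho}\big)^{p/2}}{|x-y|^{d+\rho p/2}}\,dx\,dy + \int_B \Big(\sum_{i=M+1}^N \lambda_i^2 v_i(x)^2\Big)^{p/2}dx\Bigg)^{1/p} \\ &\le C\Bigg(\sup_{x\in B}\Big(\sum_{i=M+1}^N \lambda_i^2 L_i^{2\rho}|v_i(x)|^{2(1-\rho)}\Big)^{p/2} \int_B\int_B |x-y|^{\rho p/2 - d}\,dx\,dy + \sup_{x\in B}\Big(\sum_{i=M+1}^N \lambda_i^2 v_i(x)^2\Big)^{p/2}\Bigg)^{1/p} \\ &\le C\Bigg(\sup_{x\in B}\Big(\sum_{i=M+1}^N \lambda_i^2 L_i^{2\rho}|v_i(x)|^{2(1-\rho)}\Big)^{1/2} + \sup_{x\in B}\Big(\sum_{i=M+1}^N \lambda_i^2 v_i(x)^2\Big)^{1/2}\Bigg) \end{aligned} \]

for any $\rho \in (0,1)$. Due to the assumptions of the lemma the two summations in the right-hand side converge for $N \to \infty$, and thus we obtain for all $M, N \in \mathbb{N}$ with $M < N$ the estimate

\[ \Big(\sup_{t\in[0,T]} E\|Y^{M,N}_t\|^p_0\Big)^{1/p} \le C\Bigg(\sup_{x\in B}\Big(\sum_{i=M+1}^\infty \lambda_i^2 v_i(x)^2\Big)^{1/2} + \sup_{x\in B}\Big(\sum_{i=M+1}^\infty \lambda_i^2 L_i^{2\rho}|v_i(x)|^{2(1-\rho)}\Big)^{1/2}\Bigg), \tag{68} \]

where the right-hand side decreases to zero for $M \to \infty$.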

Overall, we infer from the estimates (64) and (68) that $O^N$ is a Cauchy sequence in the two spaces $C([0,T], L^2(B))$ and $C([0,T], C(B))$ with respect to convergence in the $p$th mean and the limit is given by the process $O$. Moreover, it holds that

\[ \Big(E \sup_{t\in[0,T]} \|O_t - O^N_t\|^p\Big)^{1/p} \le C\, b_N \quad \forall N \in \mathbb{N}, \tag{69} \]

where the constant $C$ depends only on $p$ but is independent of $N$, and the sequence $b_N$ is independent of $p$ with $\lim_{N\to\infty} b_N = 0$. As we fixed $p \in \mathbb{N}$ sufficiently large at the beginning of the proof, the result (69) holds for all sufficiently large $p \in \mathbb{N}$. Then, however, Jensen's inequality implies that (69) holds for all $p \in [1, \infty)$. Proceeding as in the proof of [44, Lemma 2.1] using the Chebyshev-Markov inequality and the Borel-Cantelli lemma, one obtains that there exists for all $\delta > 0$ a random variable $Z_\delta$ with $E|Z_\delta|^p < \infty$ for all $p \ge 1$ such that

\[ \sup_{t\in[0,T]} \|O_t - O^N_t\| \le Z_\delta\, b_N^{1-\delta} \quad \text{almost surely.} \tag{70} \]

The proof is completed. □ 